Preface

Biology is designed for multi-semester biology courses for science majors. 
It is grounded on an evolutionary basis and includes exciting features that 
highlight careers in the biological sciences and everyday applications of the 
concepts at hand. To meet the needs of today’s instructors and students, 
some content has been strategically condensed while maintaining the 
overall scope and coverage of traditional texts for this course. Instructors 
can customize the book, adapting it to the approach that works best in their 
classroom. Biology also includes an innovative art program that 
incorporates critical thinking and clicker questions to help students 
understand—and apply—key concepts. 


Welcome to Biology, an OpenStax resource. This textbook was written to 
increase student access to high-quality learning materials, maintaining the
highest standards of academic rigor at little to no cost.


About OpenStax 


OpenStax is a nonprofit based at Rice University, and it’s our mission to 
improve student access to education. Our first openly licensed college 
textbook was published in 2012, and our library has since scaled to over 20 
books for college and AP courses used by hundreds of thousands of 
students. Our adaptive learning technology, designed to improve learning 
outcomes through personalized educational paths, is being piloted in 
college courses throughout the country. Through our partnerships with 
philanthropic foundations and our alliance with other educational resource 
organizations, OpenStax is breaking down the most common barriers to 
learning and empowering students and instructors to succeed. 


About OpenStax’s Resources 


Customization 


Biology is licensed under a Creative Commons Attribution 4.0 International 
(CC BY) license, which means that you can distribute, remix, and build
upon the content, as long as you provide attribution to OpenStax and its
content contributors. 


Because our books are openly licensed, you are free to use the entire book 
or pick and choose the sections that are most relevant to the needs of your 
course. Feel free to remix the content by assigning your students certain 
chapters and sections in your syllabus, in the order that you prefer. You can 
even provide a direct link in your syllabus to the sections in the web view of 
your book. 


Instructors also have the option of creating a customized version of their 
OpenStax book. The custom version can be made available to students in 
low-cost print or digital form through their campus bookstore. Visit your 
book page on openstax.org for more information. 


Errata 


All OpenStax textbooks undergo a rigorous review process. However, like 
any professional-grade textbook, errors sometimes occur. Since our books 
are web based, we can make updates periodically when deemed 
pedagogically necessary. If you have a correction to suggest, submit it 
through the link on your book page on openstax.org. Subject matter experts 
review all errata suggestions. OpenStax is committed to remaining 
transparent about all updates, so you will also find a list of past errata 
changes on your book page on openstax.org. 


Format 


You can access this textbook for free in web view or PDF through 
openstax.org, and in low-cost print and iBooks editions. 


About Biology 


Biology is designed to cover the scope and sequence requirements of a 
typical two-semester biology course for science majors. The text provides 
comprehensive coverage of foundational research and core biology 
concepts through an evolutionary lens. Biology includes rich features that 
engage students in scientific inquiry, highlight careers in the biological 
sciences, and offer everyday applications. The book also includes clicker 
questions to help students understand—and apply—key concepts. 


Coverage and Scope 


In developing Biology, we listened to hundreds of General Biology 
instructors who readily provided feedback about their courses, students, 
challenges, and hopes for innovation. The expense of textbooks and related 
items did prove to be a barrier to learning. But more importantly, these 
teachers suggested improvements for the textbook, which would ultimately 
lead to more meaningful and memorable learning experiences for students. 


The result is a book that addresses a core organizational reality of the 
course and its materials—the sheer breadth of the topical coverage. We 
provide a thorough treatment of biology’s foundational concepts while 
condensing selected topics in response to the market’s request for a 
textbook with a scope that is manageable for instructors and students alike. 
We also strive to make biology, as a discipline, interesting and accessible to 
students. In addition to a comprehensive coverage of core concepts and 
foundational research, we have incorporated features that draw learners into 
the discipline in meaningful ways. 


The pedagogical choices, chapter arrangements, and learning objective 
fulfillment were developed and vetted with the feedback of another one 
hundred reviewers, who thoroughly read the material and offered detailed 
critical commentary. 


Unit 1: The Chemistry of Life. Our opening unit introduces students 
to the sciences, including the scientific method and the fundamental 
concepts of chemistry and physics that provide a framework within 
which learners comprehend biological processes. 


Unit 2: The Cell. Students will gain a solid understanding of the
structures, functions, and processes of the most basic unit of life: the 
cell. 

Unit 3: Genetics. Our comprehensive genetics unit takes learners from 
the earliest experiments that revealed the basis of genetics through the 
intricacies of DNA to current applications in the emerging studies of 
biotechnology and genomics. 

Unit 4: Evolutionary Processes. The core concepts of evolution are 
discussed in this unit with examples illustrating evolutionary 
processes. Additionally, the evolutionary basis of biology reappears 
throughout the textbook in general discussion and is reinforced 
through special call-out features highlighting specific evolution-based 
topics. 

Unit 5: Biological Diversity. The diversity of life is explored with 
detailed study of various organisms and discussion of emerging 
phylogenetic relationships. This unit moves from viruses to living 
organisms like bacteria, discusses the organisms formerly grouped as 
protists, and devotes multiple chapters to plant and animal life. 

Unit 6: Plant Structure and Function. Our plant unit thoroughly 
covers the fundamental knowledge of plant life essential to an 
introductory biology course. 

Unit 7: Animal Structure and Function. An introduction to the form 
and function of the animal body is followed by chapters on specific 
body systems and processes. This unit touches on the biology of all 
organisms while maintaining an engaging focus on human anatomy 
and physiology that helps students connect to the topics. 

Unit 8: Ecology. Ecological concepts are broadly covered in this unit, 
with features highlighting localized, real-world issues of conservation 
and biodiversity. 


Pedagogical Foundation and Features 


Biology is grounded in a solid scientific base, with features that engage the 
students in scientific inquiry, including: 


Evolution Connection features uphold the importance of evolution to 
all biological study through discussions like “The Evolution of 
Metabolic Pathways” and “Algae and Evolutionary Paths to 
Photosynthesis.” 

Scientific Method Connection call-outs walk students through actual 
or thought experiments that elucidate the steps of the scientific process 
as applied to the topic. Features include “Determining the Time Spent 
in Cell Cycle Stages” and “Testing the Hypothesis of Independent 
Assortment.” 

Career Connection features present information on a variety of 
careers in the biological sciences, introducing students to the 
educational requirements and day-to-day work life of a variety of 
professions, such as microbiologist, ecologist, neurologist, and 
forensic scientist. 

Everyday Connection features tie biological concepts to emerging 
issues and discuss science in terms of everyday life. Topics include 
“Chesapeake Bay” and “Can Snail Venom Be Used as a 
Pharmacological Pain Killer?” 


Art and Animations That Engage 


Our art program takes a straightforward approach designed to help students 
learn the concepts of biology through simple, effective illustrations, photos, 
and micrographs. Biology also incorporates links to relevant animations and 
interactive exercises that help bring biology to life for students. 


Art Connection features call out core figures in each chapter for 
student study. Questions about key figures, including clicker questions 
that can be used in the classroom, engage students’ critical thinking to 
ensure genuine understanding. 

Link to Learning features direct students to online interactive 
exercises and animations to add a fuller context to core content. 


Additional Resources 


Student and Instructor Resources 

We've compiled additional resources for both students and instructors, 
including Getting Started Guides, an instructor solution manual, 
supplemental test items, and PowerPoint slides. Instructor resources require 
a verified instructor account, which can be requested on your openstax.org 
log-in. Take advantage of these resources to supplement your OpenStax 
book. 


Partner Resources 

OpenStax Partners are our allies in the mission to make high-quality 
learning materials affordable and accessible to students and instructors 
everywhere. Their tools integrate seamlessly with our OpenStax titles at a 
low cost. To access the partner resources for your text, visit your book page 
on openstax.org. 


Experimenting with thousands of garden peas, Mendel uncovered the fundamentals of
genetics. (credit: modification of work by Jerry Kirkhart)


Genetics is the study of heredity. Johann Gregor Mendel set the framework 
for genetics long before chromosomes or genes had been identified, at a 
time when meiosis was not well understood. Mendel selected a simple 
biological system and conducted methodical, quantitative analyses using
large sample sizes. Because of Mendel’s work, the fundamental principles
of heredity were revealed. We now know that genes, carried on 
chromosomes, are the basic functional units of heredity with the capability 
to be replicated, expressed, or mutated. Today, the postulates put forth by 
Mendel form the basis of classical, or Mendelian, genetics. Not all genes 
are transmitted from parents to offspring according to Mendelian genetics, 
but Mendel’s experiments serve as an excellent starting point for thinking 
about inheritance. 


Mendel’s Experiments and the Laws of Probability 
By the end of this section, you will be able to: 


• Describe the scientific reasons for the success of Mendel’s experimental work
• Describe the expected outcomes of monohybrid crosses involving dominant and
recessive alleles
• Apply the sum and product rules to calculate probabilities


Johann Gregor Mendel is considered the father of genetics.


Johann Gregor Mendel (1822-1884) ([link]) was a lifelong learner, teacher, scientist,
and man of faith. As a young adult, he joined the Augustinian Abbey of St. Thomas in 
Brno in what is now the Czech Republic. Supported by the monastery, he taught 
physics, botany, and natural science courses at the secondary and university levels. In 
1856, he began a decade-long research pursuit involving inheritance patterns in 
honeybees and plants, ultimately settling on pea plants as his primary model system (a 
system with convenient characteristics used to study a specific biological phenomenon 
to be applied to other systems). In 1865, Mendel presented the results of his 
experiments with nearly 30,000 pea plants to the local Natural History Society. He 
demonstrated that traits are transmitted faithfully from parents to offspring 
independently of other traits and in dominant and recessive patterns. In 1866, he 
published his work, Experiments in Plant Hybridization,[1] in the proceedings of
the Natural History Society of Brünn.

[1] Johann Gregor Mendel, Versuche über Pflanzenhybriden, Verhandlungen des
naturforschenden Vereines in Brünn, Bd. IV für das Jahr 1865, Abhandlungen, 3-47.
[For English translation, see http://www.mendelweb.org/Mendel.plain.html]


Mendel’s work went virtually unnoticed by the scientific community that believed, 
incorrectly, that the process of inheritance involved a blending of parental traits that 
produced an intermediate physical appearance in offspring; this hypothetical process 
appeared to be correct because of what we know now as continuous variation. 
Continuous variation results from the action of many genes to determine a 
characteristic like human height. Offspring appear to be a “blend” of their parents’ 
traits when we look at characteristics that exhibit continuous variation. The blending 
theory of inheritance asserted that the original parental traits were lost or absorbed by 
the blending in the offspring, but we now know that this is not the case. Mendel was the 
first researcher to see it. Instead of continuous characteristics, Mendel worked with 
traits that were inherited in distinct classes (specifically, violet versus white flowers); 
this is referred to as discontinuous variation. Mendel’s choice of these kinds of traits 
allowed him to see experimentally that the traits were not blended in the offspring, nor 
were they absorbed, but rather that they kept their distinctness and could be passed on. 
In 1868, Mendel became abbot of the monastery and exchanged his scientific pursuits 
for his pastoral duties. He was not recognized for his extraordinary scientific 
contributions during his lifetime. In fact, it was not until 1900 that his work was 
rediscovered, reproduced, and revitalized by scientists on the brink of discovering the 
chromosomal basis of heredity. 


Mendel’s Model System 


Mendel’s seminal work was accomplished using the garden pea, Pisum sativum, to 
study inheritance. This species naturally self-fertilizes, such that pollen encounters ova 
within individual flowers. The flower petals remain sealed tightly until after 
pollination, preventing pollination from other plants. The result is highly inbred, or 
“true-breeding,” pea plants. These are plants that always produce offspring that look 
like the parent. By experimenting with true-breeding pea plants, Mendel avoided the 
appearance of unexpected traits in offspring that might occur if the plants were not true 
breeding. The garden pea also grows to maturity within one season, meaning that 
several generations could be evaluated over a relatively short time. Finally, large 
quantities of garden peas could be cultivated simultaneously, allowing Mendel to 
conclude that his results did not come about simply by chance. 


Mendelian Crosses 


Mendel performed hybridizations, which involve mating two true-breeding individuals 
that have different traits. In the pea, which is naturally self-pollinating, this is done by 
manually transferring pollen from the anther of a mature pea plant of one variety to the 
stigma of a separate mature pea plant of the second variety. In plants, pollen carries the
male gametes (sperm) to the stigma, a sticky organ that traps pollen and allows the
sperm to move down the pistil to the female gametes (ova) below. To prevent the pea 
plant that was receiving pollen from self-fertilizing and confounding his results, 
Mendel painstakingly removed all of the anthers from the plant’s flowers before they 
had a chance to mature. 


Plants used in first-generation crosses were called P₀, or parental generation one, plants
([link]). Mendel collected the seeds belonging to the P₀ plants that resulted from each
cross and grew them the following season. These offspring were called the F₁, or the
first filial (filial = offspring, daughter or son), generation. Once Mendel examined the
characteristics in the F₁ generation of plants, he allowed them to self-fertilize naturally.
He then collected and grew the seeds from the F₁ plants to produce the F₂, or second
filial, generation. Mendel’s experiments extended beyond the F₂ generation to the F₃
and F₄ generations, and so on, but it was the ratio of characteristics in the P₀–F₁–F₂
generations that was the most intriguing and became the basis for Mendel’s postulates.


[Figure panels: Hybridization of true-breeding plants; Self-fertilization of hybrid plants]

In one of his experiments on inheritance patterns, Mendel crossed plants that were
true-breeding for violet flower color with plants true-breeding for white flower color
(the P generation). The resulting hybrids in the F₁ generation all had violet flowers.
In the F₂ generation, approximately three quarters of the plants had violet flowers, and
one quarter had white flowers.


Garden Pea Characteristics Revealed the Basics of Heredity 


In his 1865 publication, Mendel reported the results of his crosses involving seven
different characteristics, each with two contrasting traits. A trait is defined as a
variation in the physical appearance of a heritable characteristic. The characteristics
included plant height, seed texture, seed color, flower color, pea pod texture, pea pod
color, and flower position. For the characteristic of flower color, for example, the two
contrasting traits were white versus violet. To fully examine each characteristic, Mendel
generated large numbers of F₁ and F₂ plants, reporting results from 19,959 F₂ plants
alone. His findings were consistent.


What results did Mendel find in his crosses for flower color? First, Mendel confirmed 
that he had plants that bred true for white or violet flower color. Regardless of how 
many generations Mendel examined, all self-crossed offspring of parents with white 
flowers had white flowers, and all self-crossed offspring of parents with violet flowers 
had violet flowers. In addition, Mendel confirmed that, other than flower color, the pea 
plants were physically identical. 


Once these validations were complete, Mendel applied the pollen from a plant with 
violet flowers to the stigma of a plant with white flowers. After gathering and sowing 
the seeds that resulted from this cross, Mendel found that 100 percent of the F₁ hybrid
generation had violet flowers. Conventional wisdom at that time would have predicted
the hybrid flowers to be pale violet or for hybrid plants to have equal numbers of white
and violet flowers. In other words, the contrasting parental traits were expected to blend
in the offspring. Instead, Mendel’s results demonstrated that the white flower trait in the
F₁ generation had completely disappeared.


Importantly, Mendel did not stop his experimentation there. He allowed the F₁ plants to
self-fertilize and found that, of F₂-generation plants, 705 had violet flowers and 224 had
white flowers. This was a ratio of 3.15 violet flowers per one white flower, or
approximately 3:1. When Mendel transferred pollen from a plant with violet flowers to
the stigma of a plant with white flowers and vice versa, he obtained about the same
ratio regardless of which parent, male or female, contributed which trait. This is called
a reciprocal cross: a paired cross in which the respective traits of the male and female
in one cross become the respective traits of the female and male in the other cross. For
the other six characteristics Mendel examined, the F₁ and F₂ generations behaved in the
same way as they had for flower color. One of the two traits would disappear
completely from the F₁ generation only to reappear in the F₂ generation at a ratio of
approximately 3:1 ([link]).


The Results of Mendel’s Garden Pea Hybridizations

Characteristic     Contrasting P₀ Traits       F₁ Offspring Traits     F₂ Offspring Traits              F₂ Trait Ratios
Flower color       Violet vs. white            100 percent violet      705 violet, 224 white            3.15:1
Flower position    Axial vs. terminal          100 percent axial       651 axial, 207 terminal          3.14:1
Plant height       Tall vs. dwarf              100 percent tall        787 tall, 277 dwarf              2.84:1
Seed texture       Round vs. wrinkled          100 percent round       5,474 round, 1,850 wrinkled      2.96:1
Seed color         Yellow vs. green            100 percent yellow      6,022 yellow, 2,001 green        3.01:1
Pea pod texture    Inflated vs. constricted    100 percent inflated    882 inflated, 299 constricted    2.95:1
Pea pod color      Green vs. yellow            100 percent green       428 green, 152 yellow            2.82:1
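
The trait ratios in the table can be recomputed directly from the plant counts. The following is a minimal sketch, not part of Mendel's own analysis; the names F2_COUNTS and f2_ratio are illustrative choices, and the counts are the F₂ numbers for the seven characteristics, listed as (dominant, recessive).

```python
# F2 plant counts per characteristic, as (dominant, recessive).
# The dictionary and function names are illustrative, not from the text.
F2_COUNTS = {
    "flower color": (705, 224),       # violet, white
    "flower position": (651, 207),    # axial, terminal
    "plant height": (787, 277),       # tall, dwarf
    "seed texture": (5474, 1850),     # round, wrinkled
    "seed color": (6022, 2001),       # yellow, green
    "pea pod texture": (882, 299),    # inflated, constricted
    "pea pod color": (428, 152),      # green, yellow
}


def f2_ratio(dominant, recessive):
    """Return the number of dominant plants per one recessive plant."""
    return round(dominant / recessive, 2)


for name, (dominant, recessive) in F2_COUNTS.items():
    print(f"{name:16s} {f2_ratio(dominant, recessive):.2f}:1")

# The seven experiments together account for every F2 plant counted.
total_plants = sum(d + r for d, r in F2_COUNTS.values())
print("total F2 plants:", total_plants)
```

Each computed ratio falls close to 3:1, and the counts add up to the 19,959 F₂ plants mentioned earlier in the section.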


Upon compiling his results for many thousands of plants, Mendel concluded that the 
characteristics could be divided into expressed and latent traits. He called these, 
respectively, dominant and recessive traits. Dominant traits are those that are inherited 
unchanged in a hybridization. Recessive traits become latent, or disappear, in the 
offspring of a hybridization. The recessive trait does, however, reappear in the progeny 
of the hybrid offspring. An example of a dominant trait is the violet-flower trait. For 
this same characteristic (flower color), white-colored flowers are a recessive trait. The 
fact that the recessive trait reappeared in the F₂ generation meant that the traits
remained separate (not blended) in the plants of the F₁ generation. Mendel also
proposed that plants possessed two copies of the trait for the flower-color characteristic, 
and that each parent transmitted one of its two copies to its offspring, where they came 
together. Moreover, the physical observation of a dominant trait could mean that the 
genetic composition of the organism included two dominant versions of the 
characteristic or that it included one dominant and one recessive version. Conversely, 
the observation of a recessive trait meant that the organism lacked any dominant 
versions of this characteristic. 
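
The "two copies" proposal can be illustrated with a short enumeration. This is only a sketch under stated assumptions: the labels "V" (dominant violet version) and "v" (recessive white version) are hypothetical shorthand not used in the text, and each parent is assumed to pass on either of its two copies with equal likelihood.

```python
from collections import Counter
from itertools import product


def appearance(copies):
    """A plant looks violet if it carries at least one dominant 'V' copy."""
    return "violet" if "V" in copies else "white"


def cross(parent1, parent2):
    """Tally appearances over every pairing of one copy from each parent."""
    return Counter(appearance(a + b) for a, b in product(parent1, parent2))


f1 = cross("VV", "vv")  # true-breeding violet crossed with true-breeding white
f2 = cross("Vv", "Vv")  # a hybrid plant allowed to self-fertilize

print(dict(f1))  # every pairing yields a violet plant
print(dict(f2))  # three violet pairings for every white pairing
```

Under these assumptions, the hybrid generation is entirely violet, and self-fertilizing the hybrids gives three violet combinations for each white one, which matches the pattern Mendel observed.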


So why did Mendel repeatedly obtain 3:1 ratios in his crosses? To understand how 
Mendel deduced the basic mechanisms of inheritance that lead to such ratios, we must 
first review the laws of probability. 


Probability Basics 


Probabilities are mathematical measures of likelihood. The empirical probability of an 
event is calculated by dividing the number of times the event occurs by the total 
number of opportunities for the event to occur. It is also possible to calculate theoretical 
probabilities by dividing the number of times that an event is expected to occur by the 
number of times that it could occur. Empirical probabilities come from observations, 
like those of Mendel. Theoretical probabilities come from knowing how the events are 
produced and assuming that the probabilities of individual outcomes are equal. A
probability of one for some event indicates that it is guaranteed to occur, whereas a
probability of zero indicates that it is guaranteed not to occur. An example of a genetic
event is a round seed produced by a pea plant. In his experiment, Mendel demonstrated
that the probability of the event “round seed” occurring was one in the F₁ offspring of
true-breeding parents, one of which has round seeds and one of which has wrinkled
seeds. When the F₁ plants were subsequently self-crossed, the probability of any given
F₂ offspring having round seeds was now three out of four. In other words, in a large
population of F₂ offspring chosen at random, 75 percent were expected to have round
seeds, whereas 25 percent were expected to have wrinkled seeds. Using large numbers
of crosses, Mendel was able to calculate probabilities and use these to predict the 
outcomes of other crosses. 
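
The difference between empirical and theoretical probability can be made concrete with the flower-color counts reported earlier in this section (705 violet and 224 white F₂ plants). The sketch below is illustrative only; the function name empirical_probability is an assumption introduced here.

```python
from fractions import Fraction


def empirical_probability(occurrences, opportunities):
    """Number of times the event occurred divided by the chances it had to occur."""
    return Fraction(occurrences, opportunities)


violet, white = 705, 224  # observed F2 plants for flower color

empirical = empirical_probability(violet, violet + white)
theoretical = Fraction(3, 4)  # expected share of the dominant trait in the F2

print(f"empirical:   {float(empirical):.4f}")
print(f"theoretical: {float(theoretical):.4f}")
print(f"difference:  {abs(float(empirical) - float(theoretical)):.4f}")
```

The observed value sits within about one percentage point of the theoretical three out of four.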


The Product Rule and Sum Rule 


Mendel demonstrated that the pea-plant characteristics he studied were transmitted as 
discrete units from parent to offspring. As will be discussed, Mendel also determined 
that different characteristics, like seed color and seed texture, were transmitted 
independently of one another and could be considered in separate probability analyses. 
For instance, performing a cross between a plant with green, wrinkled seeds and a plant
with yellow, round seeds still produced F₂ offspring that had a 3:1 ratio of yellow:green
seeds (ignoring seed texture) and a 3:1 ratio of round:wrinkled seeds (ignoring seed
color). The characteristics of color and texture did not influence each other. 


The product rule of probability can be applied to this phenomenon of the independent 
transmission of characteristics. The product rule states that the probability of two 
independent events occurring together can be calculated by multiplying the individual 
probabilities of each event occurring alone. To demonstrate the product rule, imagine 
that you are rolling a six-sided die (D) and flipping a penny (P) at the same time. The 
die may roll any number from 1–6 (D_#), whereas the penny may turn up heads (P_H) or
tails (P_T). The outcome of rolling the die has no effect on the outcome of flipping the
penny and vice versa. There are 12 possible outcomes of this action ([link]), and each 
event is expected to occur with equal probability. 


Twelve Equally Likely Outcomes of Rolling a Die and Flipping a Penny

Rolling Die    Flipping Penny
D_1            P_H
D_1            P_T
D_2            P_H
D_2            P_T
D_3            P_H
D_3            P_T
D_4            P_H
D_4            P_T
D_5            P_H
D_5            P_T
D_6            P_H
D_6            P_T


Of the 12 possible outcomes, the die has a 2/12 (or 1/6) probability of rolling a two, and
the penny has a 6/12 (or 1/2) probability of coming up heads. By the product rule, the
probability that you will obtain the combined outcome 2 and heads is: (D_2) × (P_H) =
(1/6) × (1/2) or 1/12 ([link]). Notice the word “and” in the description of the probability.
The “and” is a signal to apply the product rule. For example, consider how the product
rule is applied to the dihybrid cross: the probability of having both dominant traits in
the F₂ progeny is the product of the probabilities of having the dominant trait for each
characteristic, as shown here:

Equation: 


\[ \frac{3}{4} \times \frac{3}{4} = \frac{9}{16} \]
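
The product rule can be checked by listing the die-and-penny outcomes and counting. The following sketch assumes a fair six-sided die and a fair penny; the helper name probability is introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product

die = [1, 2, 3, 4, 5, 6]
penny = ["H", "T"]
outcomes = list(product(die, penny))  # 12 equally likely (die, penny) pairs


def probability(event):
    """Share of the equally likely outcomes for which event(outcome) is true."""
    hits = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(hits, len(outcomes))


p_two = probability(lambda o: o[0] == 2)
p_heads = probability(lambda o: o[1] == "H")
p_two_and_heads = probability(lambda o: o == (2, "H"))

print(p_two, p_heads, p_two_and_heads)  # 1/6 1/2 1/12
print(p_two_and_heads == p_two * p_heads)  # True

# The same rule applied to two independent characteristics in a dihybrid F2,
# each showing its dominant trait with probability 3/4.
p_both_dominant = Fraction(3, 4) * Fraction(3, 4)
print(p_both_dominant)  # 9/16
```

Counting outcomes and multiplying the individual probabilities give the same answer, which is what independence of the two events implies.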


On the other hand, the sum rule of probability is applied when considering two
mutually exclusive outcomes that can come about by more than one pathway. The sum
rule states that the probability of the occurrence of one event or the other event, of two
mutually exclusive events, is the sum of their individual probabilities. Notice the word
“or” in the description of the probability. The “or” indicates that you should apply the
sum rule. In this case, let’s imagine you are flipping a penny (P) and a quarter (Q).
What is the probability of one coin coming up heads and one coin coming up tails? This
outcome can be achieved by two cases: the penny may be heads (P_H) and the quarter
may be tails (Q_T), or the quarter may be heads (Q_H) and the penny may be tails (P_T).
Either case fulfills the outcome. By the sum rule, we calculate the probability of
obtaining one head and one tail as [(P_H) × (Q_T)] + [(Q_H) × (P_T)] = [(1/2) × (1/2)] +
[(1/2) × (1/2)] = 1/2 ([link]). You should also notice that we used the product rule to
calculate the probability of P_H and Q_T, and also the probability of P_T and Q_H, before we
summed them. Again, the sum rule can be applied to show the probability of having
at least one dominant trait in the F₂ generation of a dihybrid cross. The two mutually
exclusive pathways are a dominant trait for the first characteristic (3/4), or a recessive
trait for the first characteristic together with a dominant trait for the second
(1/4 × 3/4 = 3/16):


Equation:
\[ \frac{3}{16} + \frac{3}{4} = \frac{15}{16} \]
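
The sum rule can be verified the same way, by listing outcomes and adding the probabilities of the mutually exclusive cases. This is a sketch assuming fair coins and independent characteristics; the variable names are illustrative. For the dihybrid cross it computes two different quantities, "at least one dominant trait" and "exactly one dominant trait," because the two are easy to confuse.

```python
from fractions import Fraction
from itertools import product

# Each outcome is (penny, quarter); the four outcomes are equally likely.
coins = list(product("HT", repeat=2))


def probability(event, outcomes):
    """Share of equally likely outcomes for which event(outcome) is true."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


p_ph_qt = probability(lambda o: o == ("H", "T"), coins)  # penny heads, quarter tails
p_pt_qh = probability(lambda o: o == ("T", "H"), coins)  # penny tails, quarter heads
p_one_each = probability(lambda o: set(o) == {"H", "T"}, coins)

print(p_ph_qt + p_pt_qh, p_one_each)  # 1/2 1/2

# Dihybrid F2: each characteristic shows its dominant trait with probability 3/4.
dominant, recessive = Fraction(3, 4), Fraction(1, 4)

# Mutually exclusive pathways to "at least one dominant trait":
# dominant for the first characteristic, or recessive for the first
# and dominant for the second.
p_at_least_one = dominant + recessive * dominant
# Mutually exclusive pathways to "exactly one dominant trait".
p_exactly_one = dominant * recessive + recessive * dominant

print(p_at_least_one)  # 15/16
print(p_exactly_one)   # 3/8
```

As a cross-check, the probability of at least one dominant trait also equals one minus the probability that both traits are recessive, 1 − (1/4 × 1/4) = 15/16.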


The Product Rule and Sum Rule 


Product Rule: For independent events A and B, the probability (P) of them both
occurring (A and B) is (P_A × P_B).

Sum Rule: For mutually exclusive events A and B, the probability (P) that at least one
occurs (A or B) is (P_A + P_B).


To use probability laws in practice, it is necessary to work with large sample sizes 
because small sample sizes are prone to deviations caused by chance. The large 
quantities of pea plants that Mendel examined allowed him to calculate the probabilities
of the traits appearing in his F₂ generation. As you will learn, this discovery meant that
when parental traits were known, the offspring’s traits could be predicted accurately 
even before fertilization. 
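
The effect of sample size can be explored with a small simulation. This is a hypothetical sketch, not a reconstruction of Mendel's data: it assumes each F₂ plant independently shows the dominant trait with probability 3/4, and the seed value is arbitrary and only makes reruns repeatable.

```python
import random


def dominant_fraction(n_plants, rng):
    """Simulate n_plants F2 offspring and return the share showing the dominant trait."""
    dominant = sum(1 for _ in range(n_plants) if rng.random() < 0.75)
    return dominant / n_plants


rng = random.Random(12)  # arbitrary fixed seed so that reruns give the same numbers
for n_plants in (10, 100, 1_000, 100_000):
    fraction = dominant_fraction(n_plants, rng)
    print(f"{n_plants:>7} plants: dominant fraction {fraction:.3f}")
```

Small samples can land noticeably away from 0.75 purely by chance, whereas samples in the tens of thousands tend to stay very close to it.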


Section Summary 


Working with garden pea plants, Mendel found that crosses between parents that
differed by one trait produced F₁ offspring that all expressed the traits of one parent.
Observable traits are referred to as dominant, and non-expressed traits are described as
recessive. When the offspring in Mendel’s experiment were self-crossed, the F₂
offspring exhibited the dominant trait or the recessive trait in a 3:1 ratio, confirming
that the recessive trait had been transmitted faithfully from the original P₀ parent.
Reciprocal crosses generated identical F₁ and F₂ offspring ratios. By examining large
sample sizes, Mendel showed that his crosses behaved reproducibly according to the laws
of probability, and that the traits were inherited as independent events.


Two rules in probability can be used to find the expected proportions of offspring of 
different traits from different crosses. To find the probability of two or more 
independent events occurring together, apply the product rule and multiply the 
probabilities of the individual events. The use of the word “and” suggests the 
appropriate application of the product rule. To find the probability that any one of two
or more mutually exclusive events occurs, apply the sum rule and add their individual
probabilities together. The use of the word “or” suggests the appropriate application of
the sum rule. 


Review Questions 


Exercise: 


Problem: 


Mendel performed hybridizations by transferring pollen from the ________ of the
male plant to the female ova.


a. anther 
b. pistil 
c. stigma 
d. seed 


Solution: 


A 
Exercise: 


Problem: 
Which is one of the seven characteristics that Mendel observed in pea plants? 


a. flower size 


b. seed texture 
c. leaf shape 
d. stem color 


Solution: 


B 
Exercise: 
Problem: 
Imagine you are performing a cross involving seed color in garden pea plants. 


What F₁ offspring would you expect if you cross true-breeding parents with green
seeds and yellow seeds? Yellow seed color is dominant over green. 


a. 100 percent yellow-green seeds 

b. 100 percent yellow seeds 

c. 50 percent yellow, 50 percent green seeds 
d. 25 percent green, 75 percent yellow seeds 


Solution: 


B 
Exercise: 


Problem: 


Consider a cross to investigate the pea pod texture trait, involving constricted or 
inflated pods. Mendel found that the traits behave according to a 
dominant/recessive pattern in which inflated pods were dominant. If you 
performed this cross and obtained 650 inflated-pod plants in the F₂ generation,
approximately how many constricted-pod plants would you expect to have? 


a. 600 
b. 165 
c. 217
d. 468 


Solution: 


C
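
One way to arrive at this estimate, assuming the F₂ plants follow the expected 3:1 ratio of inflated to constricted pods, is to treat the 650 inflated-pod plants as three quarters of the F₂ generation and scale accordingly:

```latex
\[
\text{constricted} \approx 650 \times \frac{1/4}{3/4} = \frac{650}{3} \approx 217
\]
```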


Free Response 


Exercise: 


Problem: 


Describe one of the reasons why the garden pea was an excellent choice of model 
system for studying inheritance. 


Solution: 


The garden pea is sessile and has flowers that close tightly during self-pollination. 
These features help to prevent accidental or unintentional fertilizations that could 
have diminished the accuracy of Mendel’s data. 


Exercise: 


Problem: 


How would you perform a reciprocal cross for the characteristic of stem height in 
the garden pea? 


Solution: 


Two sets of P₀ parents would be used. In the first cross, pollen would be
transferred from a true-breeding tall plant to the stigma of a true-breeding dwarf
plant. In the second cross, pollen would be transferred from a true-breeding dwarf
plant to the stigma of a true-breeding tall plant. For each cross, F₁ and F₂ offspring
would be analyzed to determine if offspring traits were affected according to 
which parent donated each trait. 


Glossary 


blending theory of inheritance 
hypothetical inheritance pattern in which parental traits are blended together in the 
offspring to produce an intermediate physical appearance 


continuous variation 
inheritance pattern in which a character shows a range of trait values with small 
gradations rather than large gaps between them 


discontinuous variation 
inheritance pattern in which traits are distinct and are transmitted independently of 
one another 


dominant 
trait which confers the same physical appearance whether an individual has two 
copies of the trait or one copy of the dominant trait and one copy of the recessive 
trait 


F₁
first filial generation in a cross; the offspring of the parental generation 


F₂
second filial generation produced when F₁ individuals are self-crossed or fertilized
with each other


hybridization 
process of mating two individuals that differ with the goal of achieving a certain 
characteristic in their offspring 


model system 
species or biological system used to study a specific biological phenomenon to be 
applied to other different species 


P₀
parental generation in a cross 


product rule 
probability of two independent events occurring simultaneously can be calculated 
by multiplying the individual probabilities of each event occurring alone 


recessive 
trait that appears “latent” or non-expressed when the individual also carries a 
dominant trait for that same characteristic; when present as two identical copies, 
the recessive trait is expressed 


reciprocal cross 
paired cross in which the respective traits of the male and female in one cross 
become the respective traits of the female and male in the other cross 


sum rule 
probability of the occurrence of at least one of two mutually exclusive events is 
the sum of their individual probabilities 


trait 
variation in the physical appearance of a heritable characteristic 


Characteristics and Traits 
By the end of this section, you will be able to: 


• Explain the relationship between genotypes and phenotypes in
dominant and recessive gene systems

• Develop a Punnett square to calculate the expected proportions of
genotypes and phenotypes in a monohybrid cross

• Explain the purpose and methods of a test cross

• Identify non-Mendelian inheritance patterns such as incomplete
dominance, codominance, recessive lethals, multiple alleles, and sex
linkage


The seven characteristics that Mendel evaluated in his pea plants were each 
expressed as one of two versions, or traits. The physical expression of 
characteristics is accomplished through the expression of genes carried on 
chromosomes. The genetic makeup of peas consists of two similar or 
homologous copies of each chromosome, one from each parent. Each pair 
of homologous chromosomes has the same linear order of genes. In other 
words, peas are diploid organisms in that they have two copies of each 
chromosome. The same is true for many other plants and for virtually all 
animals. Diploid organisms use meiosis to produce haploid gametes,
each of which contains one copy of each homologous chromosome. Two
gametes unite at fertilization to create a diploid zygote.


For cases in which a single gene controls a single characteristic, a diploid 
organism has two genetic copies that may or may not encode the same 
version of that characteristic. Gene variants that arise by mutation and exist 
at the same relative locations on homologous chromosomes are called 
alleles. Mendel examined the inheritance of genes with just two allele 
forms, but it is common to encounter more than two alleles for any given 
gene in a natural population. 


Phenotypes and Genotypes 


Two alleles for a given gene in a diploid organism are expressed and 
interact to produce physical characteristics. The observable traits expressed 
by an organism are referred to as its phenotype. An organism’s underlying
genetic makeup, consisting of both physically visible and non-expressed
alleles, is called its genotype. Mendel’s hybridization experiments
demonstrate the difference between phenotype and genotype. When
true-breeding plants in which one parent had yellow seeds and one had green
seeds were cross-fertilized, all of the F₁ hybrid offspring had yellow seeds.
That is, the hybrid offspring were phenotypically identical to the
true-breeding parent with yellow seeds. However, we know that the allele
donated by the parent with green seeds was not simply lost because it
reappeared in some of the F₂ offspring. Therefore, the F₁ plants must have
been genotypically different from the parent with yellow seeds.


The P₀ plants that Mendel used in his experiments were each homozygous
for the trait he was studying. Diploid organisms that are homozygous at a
given gene, or locus, have two identical alleles for that gene on their
homologous chromosomes. Mendel’s parental pea plants always bred true
because both of the gametes produced carried the same trait. When P₀
plants with contrasting traits were cross-fertilized, all of the offspring were
heterozygous for the contrasting trait, meaning that their genotype reflected
that they had different alleles for the gene being examined.


Dominant and Recessive Alleles 


Our discussion of homozygous and heterozygous organisms brings us to 
why the F₁ heterozygous offspring were identical to one of the parents,
rather than expressing both alleles. In all seven pea-plant characteristics, 
one of the two contrasting alleles was dominant, and the other was 
recessive. Mendel called the dominant allele the expressed unit factor; the 
recessive allele was referred to as the latent unit factor. We now know that 
these so-called unit factors are actually genes on homologous chromosome 
pairs. For a gene that is expressed in a dominant and recessive pattern, 
homozygous dominant and heterozygous organisms will look identical (that 
is, they will have different genotypes but the same phenotype). The 
recessive allele will only be observed in homozygous recessive individuals 
([link]).


Human Inheritance in Dominant and Recessive Patterns 


Dominant Traits          Recessive Traits
Achondroplasia           Albinism
Brachydactyly            Cystic fibrosis
Huntington’s disease     Duchenne muscular dystrophy
Marfan syndrome          Galactosemia
Neurofibromatosis        Phenylketonuria
Widow’s peak             Sickle-cell anemia
Wooly hair               Tay-Sachs disease


Several conventions exist for referring to genes and alleles. For the 
purposes of this chapter, we will abbreviate genes using the first letter of the 
gene’s corresponding dominant trait. For example, violet is the dominant 
trait for a pea plant’s flower color, so the flower-color gene would be 
abbreviated as V (note that it is customary to italicize gene designations). 
Furthermore, we will use uppercase and lowercase letters to represent 
dominant and recessive alleles, respectively. Therefore, we would refer to 
the genotype of a homozygous dominant pea plant with violet flowers as 
VV, a homozygous recessive pea plant with white flowers as vv, and a
heterozygous pea plant with violet flowers as Vv. 


The Punnett Square Approach for a Monohybrid Cross 


When fertilization occurs between two true-breeding parents that differ in 
only one characteristic, the process is called a monohybrid cross, and the 
resulting offspring are monohybrids. Mendel performed seven monohybrid 
crosses involving contrasting traits for each characteristic. On the basis of 
his results in F₁ and F₂ generations, Mendel postulated that each parent in
the monohybrid cross contributed one of two paired unit factors to each
offspring, and every possible combination of unit factors was equally likely.


To demonstrate a monohybrid cross, consider the case of true-breeding pea 
plants with yellow versus green pea seeds. The dominant seed color is 
yellow; therefore, the parental genotypes were YY for the plants with yellow 
seeds and yy for the plants with green seeds, respectively. A Punnett 
square, devised by the British geneticist Reginald Punnett, can be drawn 
that applies the rules of probability to predict the possible outcomes of a 
genetic cross or mating and their expected frequencies. To prepare a Punnett 
square, all possible combinations of the parental alleles are listed along the 
top (for one parent) and side (for the other parent) of a grid, representing 
their meiotic segregation into haploid gametes. Then the combinations of 
egg and sperm are made in the boxes in the table to show which alleles are 
combining. Each box then represents the diploid genotype of a zygote, or 
fertilized egg, that could result from this mating. Because each possibility is 
equally likely, genotypic ratios can be determined from a Punnett square. If 
the pattern of inheritance (dominant or recessive) is known, the phenotypic 
ratios can be inferred as well. For a monohybrid cross of two true-breeding 
parents, each parent contributes one type of allele. In this case, only one 
genotype is possible. All offspring are Yy and have yellow seeds ([link]).


In the P generation, pea plants that
are true-breeding for the dominant
yellow phenotype are crossed with
plants with the recessive green
phenotype. This cross produces F₁
heterozygotes with a yellow
phenotype. Punnett square
analysis can be used to predict the
genotypes of the F₂ generation.
(Figure column headings:
phenotypes, genotypes, genotype
ratio, phenotype ratio.)


A self-cross of one of the Yy heterozygous offspring can be represented in a
2 × 2 Punnett square because each parent can donate one of two different
alleles. Therefore, the offspring can potentially have one of four allele
combinations: YY, Yy, yY, or yy ([link]). Notice that there are two ways to
obtain the Yy genotype: a Y from the egg and a y from the sperm, or a y
from the egg and a Y from the sperm. Both of these possibilities must be
counted. Recall that Mendel’s pea-plant characteristics behaved in the same
way in reciprocal crosses. Therefore, the two possible heterozygous
combinations produce offspring that are genotypically and phenotypically
identical despite their dominant and recessive alleles deriving from different
parents. They are grouped together. Because fertilization is a random event,
we expect each combination to be equally likely and for the offspring to
exhibit a ratio of YY:Yy:yy genotypes of 1:2:1 ([link]). Furthermore, because
the YY and Yy offspring have yellow seeds and are phenotypically identical,
applying the sum rule of probability, we expect the offspring to exhibit a
phenotypic ratio of 3 yellow:1 green. Indeed, working with large sample
sizes, Mendel observed approximately this ratio in every F₂ generation
resulting from crosses for individual traits.
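The Punnett square reasoning for a monohybrid cross can be sketched in a few lines of Python. This is only an illustration under simple assumptions: genotypes are written as two-letter strings such as "Yy", an uppercase letter marks the dominant allele, and every gamete pairing is equally likely. The function names `punnett` and `phenotypes` are hypothetical and chosen for this example.

```python
from collections import Counter
from itertools import product


def punnett(parent1, parent2):
    """Count offspring genotypes for a single-gene cross.

    Each parent is a two-letter genotype such as "Yy". Every pairing of
    one allele from each parent fills one box of the Punnett square.
    """
    counts = Counter()
    for a, b in product(parent1, parent2):
        # Sort so the uppercase (dominant) allele is written first: "yY" -> "Yy".
        counts["".join(sorted(a + b))] += 1
    return counts


def phenotypes(genotype_counts, dominant="yellow", recessive="green"):
    """Collapse genotype counts into phenotype counts under complete dominance."""
    result = Counter()
    for genotype, n in genotype_counts.items():
        # One uppercase allele is enough to express the dominant trait.
        label = dominant if any(c.isupper() for c in genotype) else recessive
        result[label] += n
    return result


f1 = punnett("YY", "yy")   # true-breeding parents: every box is Yy
f2 = punnett("Yy", "Yy")   # self-cross of an F1 heterozygote
print(dict(f1))
print(dict(f2))
print(dict(phenotypes(f2)))
```

Grouping "yY" with "Yy" mirrors the point made above that the two heterozygous boxes are counted together, which is what turns four equally likely boxes into a 1:2:1 genotypic ratio and a 3:1 phenotypic ratio.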


Mendel validated these results by performing an F₃ cross in which he
self-crossed the dominant- and recessive-expressing F₂ plants. When he
self-crossed the plants expressing green seeds, all of the offspring had green
seeds, confirming that all green seeds had homozygous genotypes of yy.
When he self-crossed the F₂ plants expressing yellow seeds, he found that
one-third of the plants bred true, and two-thirds of the plants segregated at a
3:1 ratio of yellow:green seeds. In this case, the true-breeding plants had
homozygous (YY) genotypes, whereas the segregating plants corresponded
to the heterozygous (Yy) genotype. When these plants self-fertilized, the
outcome was just like the F₁ self-fertilizing cross.


The Test Cross Distinguishes the Dominant Phenotype 


Beyond predicting the offspring of a cross between known homozygous or 
heterozygous parents, Mendel also developed a way to determine whether 
an organism that expressed a dominant trait was a heterozygote or a 
homozygote. Called the test cross, this technique is still used by plant and 
animal breeders. In a test cross, the dominant-expressing organism is 
crossed with an organism that is homozygous recessive for the same 
characteristic. If the dominant-expressing organism is a homozygote, then
all F₁ offspring will be heterozygotes expressing the dominant trait ([link]).
Alternatively, if the dominant-expressing organism is a heterozygote, the F₁
offspring will exhibit a 1:1 ratio of heterozygotes and recessive
homozygotes ([link]). The test cross further validates Mendel’s postulate
that pairs of unit factors segregate equally.
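The logic of a test cross can also be expressed numerically. The sketch below assumes complete dominance and a homozygous recessive tester, so the phenotype of each offspring depends only on which allele the unknown parent contributes. The function names `dominant_fraction` and `prob_all_dominant` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def dominant_fraction(unknown):
    """Expected fraction of test-cross offspring showing the dominant trait.

    `unknown` is the genotype of the dominant-expressing parent, written
    as "RR" or "Rr". The tester parent contributes only recessive alleles,
    so an offspring is dominant exactly when it receives an uppercase allele.
    """
    dominant_gametes = sum(1 for allele in unknown if allele.isupper())
    return Fraction(dominant_gametes, len(unknown))


def prob_all_dominant(unknown, n):
    """Probability that n independent offspring all show the dominant trait.

    Applies the product rule: multiply the per-offspring probability n times.
    """
    return dominant_fraction(unknown) ** n


print(dominant_fraction("RR"))      # homozygote: all offspring dominant
print(dominant_fraction("Rr"))      # heterozygote: 1:1 ratio
print(prob_all_dominant("Rr", 5))   # chance a heterozygote goes undetected in 5 offspring
```

The last line shows why breeders score many offspring: a heterozygous parent can produce a small all-dominant brood by chance, but the probability of that shrinks by half with each additional offspring.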


Note: 
Art Connection 


A test cross can be performed to 
determine whether an organism 
expressing a dominant trait is a 
homozygote or a heterozygote. 


In pea plants, round peas (R) are dominant to wrinkled peas (r). You do a
test cross between a pea plant with wrinkled peas (genotype rr) and a plant
of unknown genotype that has round peas. You end up with three plants, all
of which have round peas. From these data, can you tell if the round pea parent
plant is homozygous dominant or heterozygous? If the round pea parent
plant is heterozygous, what is the probability that a random sample of 3
progeny peas will all be round?


Many human diseases are genetically inherited. A healthy person in a 
family in which some members suffer from a recessive genetic disorder 
may want to know if he or she has the disease-causing gene and what risk 
exists of passing the disorder on to his or her offspring. Of course, doing a 
test cross in humans is unethical and impractical. Instead, geneticists use 
pedigree analysis to study the inheritance pattern of human genetic 
diseases ([link]).


Note: 


Art Connection 
Pedigree Analysis for Alkaptonuria 


(Figure: a pedigree spanning the first, second, third, and fourth
generations. Legend: male, female, unaffected, affected.)


Alkaptonuria is a recessive genetic 
disorder in which two amino 
acids, phenylalanine and tyrosine, 
are not properly metabolized. 
Affected individuals may have
darkened skin and brown urine,
and may suffer joint damage and 
other complications. In this 
pedigree, individuals with the 
disorder are indicated in blue and 
have the genotype aa. Unaffected 
individuals are indicated in yellow 
and have the genotype AA or Aa. 
Note that it is often possible to 
determine a person’s genotype 
from the genotype of their 
offspring. For example, if neither
parent has the disorder but their
child does, both parents must be
heterozygous. Two individuals on
the pedigree have an unaffected 
phenotype but unknown genotype. 
Because they do not have the 
disorder, they must have at least 
one normal allele, so their 
genotype gets the “A?” 
designation. 


What are the genotypes of the individuals labeled 1, 2 and 3? 


Alternatives to Dominance and Recessiveness 


Mendel’s experiments with pea plants suggested that: (1) two “units” or 
alleles exist for every gene; (2) alleles maintain their integrity in each 
generation (no blending); and (3) in the presence of the dominant allele, the 
recessive allele is hidden and makes no contribution to the phenotype. 
Therefore, recessive alleles can be “carried” and not expressed by 
individuals. Such heterozygous individuals are sometimes referred to as 
“carriers.” Further genetic studies in other plants and animals have shown
that much more complexity exists, but that the fundamental principles of
Mendelian genetics still hold true. In the sections to follow, we consider 
some of the extensions of Mendelism. If Mendel had chosen an 
experimental system that exhibited these genetic complexities, it’s possible 
that he would not have understood what his results meant. 


Incomplete Dominance 


Mendel’s results, that traits are inherited as dominant and recessive pairs, 
contradicted the view at that time that offspring exhibited a blend of their 
parents’ traits. However, the heterozygote phenotype occasionally does 
appear to be intermediate between the two parents. For example, in the 
snapdragon, Antirrhinum majus ([link]), a cross between a homozygous
parent with white flowers (CᵂCᵂ) and a homozygous parent with red
flowers (CᴿCᴿ) will produce offspring with pink flowers (CᴿCᵂ). (Note that
different genotypic abbreviations are used for Mendelian extensions to
distinguish these patterns from simple dominance and recessiveness.) This
pattern of inheritance is described as incomplete dominance, denoting the
expression of two contrasting alleles such that the individual displays an
intermediate phenotype. The allele for red flowers is incompletely dominant
over the allele for white flowers. However, the results of a heterozygote
self-cross can still be predicted, just as with Mendelian dominant and
recessive crosses. In this case, the genotypic ratio would be 1 CᴿCᴿ:2
CᴿCᵂ:1 CᵂCᵂ, and the phenotypic ratio would be 1:2:1 for red:pink:white.
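With incomplete dominance, each genotype has its own phenotype, so the phenotypic ratio simply follows the genotypic ratio. The short sketch below illustrates this for a snapdragon self-cross. The single letters "R" and "W" are shorthand assumed here for the red and white flower-color alleles, and `self_cross` is a hypothetical helper name.

```python
from collections import Counter
from itertools import product

# Assumed shorthand: "R" is the red allele, "W" is the white allele.
# Every genotype maps to a distinct phenotype; the heterozygote is intermediate.
PHENOTYPE = {
    ("R", "R"): "red",
    ("R", "W"): "pink",
    ("W", "W"): "white",
}


def self_cross(genotype):
    """Count offspring phenotypes when a plant of `genotype` is self-crossed."""
    counts = Counter()
    for pair in product(genotype, repeat=2):
        # Sort the pair so ("W", "R") and ("R", "W") are treated as the same genotype.
        counts[PHENOTYPE[tuple(sorted(pair))]] += 1
    return counts


print(dict(self_cross(("R", "W"))))
```

Compared with complete dominance, the only change is the phenotype lookup: no two genotypes share a label, so nothing collapses the 1:2:1 ratio into 3:1.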


These pink flowers of a 
heterozygote snapdragon result 
from incomplete dominance. 
(credit: 
“storebukkebruse”/Flickr) 


Codominance 


A variation on incomplete dominance is codominance, in which both 
alleles for the same characteristic are simultaneously expressed in the 
heterozygote. An example of codominance is the MN blood groups of 
humans. The M and N alleles are expressed in the form of an M or N 
antigen present on the surface of red blood cells. Homozygotes (LᴹLᴹ and
LᴺLᴺ) express either the M or the N allele, and heterozygotes (LᴹLᴺ)
express both alleles equally. In a self-cross between heterozygotes
expressing a codominant trait, the three possible offspring genotypes are
phenotypically distinct. However, the 1:2:1 genotypic ratio characteristic of 
a Mendelian monohybrid cross still applies. 


Multiple Alleles 


Mendel implied that only two alleles, one dominant and one recessive, 
could exist for a given gene. We now know that this is an 
oversimplification. Although individual humans (and all diploid organisms) 
can only have two alleles for a given gene, multiple alleles may exist at the 
population level such that many combinations of two alleles are observed. 
Note that when many alleles exist for the same gene, the convention is to 
denote the most common phenotype or genotype among wild animals as the 
wild type (often abbreviated “+”); this is considered the standard or norm. 
All other phenotypes or genotypes are considered variants of this standard, 
meaning that they deviate from the wild type. The variant may be recessive 
or dominant to the wild-type allele. 


An example of multiple alleles is coat color in rabbits ([link]). Here, four 
alleles exist for the c gene. The wild-type version, C⁺C⁺, is expressed as
brown fur. The chinchilla phenotype, cᶜʰcᶜʰ, is expressed as black-tipped
white fur. The Himalayan phenotype, cʰcʰ, has black fur on the extremities
and white fur elsewhere. Finally, the albino, or “colorless” phenotype, cc, is
expressed as white fur. In cases of multiple alleles, dominance hierarchies 
can exist. In this case, the wild-type allele is dominant over all the others, 
chinchilla is incompletely dominant over Himalayan and albino, and 
Himalayan is dominant over albino. This hierarchy, or allelic series, was 
revealed by observing the phenotypes of each possible heterozygote 
offspring. 


Four different alleles exist for the rabbit coat color
(C) gene. (Figure column headings: allele, genotype, phenotype.)


The complete dominance of a wild-type phenotype over all other mutants 
often occurs as an effect of “dosage” of a specific gene product, such that 
the wild-type allele supplies the correct amount of gene product whereas the 
mutant alleles cannot. For the allelic series in rabbits, the wild-type allele 
may supply a given dosage of fur pigment, whereas the mutants supply a 
lesser dosage or none at all. Interestingly, the Himalayan phenotype is the 
result of an allele that produces a temperature-sensitive gene product that 
only produces pigment in the cooler extremities of the rabbit’s body. 


Alternatively, one mutant allele can be dominant over all other phenotypes, 
including the wild type. This may occur when the mutant allele somehow 
interferes with the genetic message so that even a heterozygote with one 
wild-type allele copy expresses the mutant phenotype. One way in which 
the mutant allele can interfere is by enhancing the function of the wild-type 
gene product or changing its distribution in the body. One example of this is
the Antennapedia mutation in Drosophila ([link]). In this case, the mutant
allele expands the distribution of the gene product, and as a result, the
Antennapedia heterozygote develops legs on its head where its antennae
should be.


As seen in comparing the 
wild-type Drosophila 
(left) and the 
Antennapedia mutant 
(right), the Antennapedia 
mutant has legs on its 
head in place of antennae. 


Note: 

Evolution Connection 

Multiple Alleles Confer Drug Resistance in the Malaria Parasite 
Malaria is a parasitic disease in humans that is transmitted by infected 
female mosquitoes, including Anopheles gambiae ([link]a), and is
characterized by cyclic high fevers, chills, flu-like symptoms, and severe
anemia. Plasmodium falciparum and P. vivax are the most common
causative agents of malaria, and P. falciparum is the most deadly ([link]b).
When promptly and correctly treated, P. falciparum malaria has a mortality
rate of 0.1 percent. However, in some parts of the world, the parasite has
evolved resistance to commonly used malaria treatments, so the most
effective malarial treatments can vary by geographic region.


The (a) Anopheles gambiae, or African malaria 
mosquito, acts as a vector in the transmission to 
humans of the malaria-causing parasite (b) 
Plasmodium falciparum, here visualized using 
false-color transmission electron microscopy. 
(credit a: James D. Gathany; credit b: Ute 
Frevert; false color by Margaret Shear; scale-bar 
data from Matt Russell) 


In Southeast Asia, Africa, and South America, P. falciparum has developed 
resistance to the anti-malarial drugs chloroquine, mefloquine, and 
sulfadoxine-pyrimethamine. P. falciparum, which is haploid during the life 
stage in which it is infectious to humans, has evolved multiple drug- 
resistant mutant alleles of the dhps gene. Varying degrees of sulfadoxine 
resistance are associated with each of these alleles. Being haploid, P. 
falciparum needs only one drug-resistant allele to express this trait. 

In Southeast Asia, different sulfadoxine-resistant alleles of the dhps gene 
are localized to different geographic regions. This is a common 
evolutionary phenomenon that occurs because drug-resistant mutants arise 
in a population and interbreed with other P. falciparum isolates in close 
proximity. Sulfadoxine-resistant parasites cause considerable human 
hardship in regions where this drug is widely used as an over-the-counter 
malaria remedy. As is common with pathogens that multiply to large 


numbers within an infection cycle, P. falciparum evolves relatively rapidly 
(over a decade or so) in response to the selective pressure of commonly 
used anti-malarial drugs. For this reason, scientists must constantly work to 
develop new drugs or drug combinations to combat the worldwide malaria 
burden. [footnote]

Sumiti Vinayak, et al., “Origin and Evolution of Sulfadoxine Resistant
Plasmodium falciparum,” Public Library of Science Pathogens 6, no. 3
(2010): e1000830, doi:10.1371/journal.ppat.1000830.


X-Linked Traits 


In humans, as well as in many other animals and some plants, the sex of the 
individual is determined by sex chromosomes. The sex chromosomes are 
one pair of non-homologous chromosomes. Until now, we have only 
considered inheritance patterns among non-sex chromosomes, or 
autosomes. In addition to 22 homologous pairs of autosomes, human 
females have a homologous pair of X chromosomes, whereas human males 
have an XY chromosome pair. Although the Y chromosome contains a 
small region of similarity to the X chromosome so that they can pair during 
meiosis, the Y chromosome is much shorter and contains many fewer 
genes. When a gene being examined is present on the X chromosome, but 
not on the Y chromosome, it is said to be X-linked. 


Eye color in Drosophila was one of the first X-linked traits to be identified. 
Thomas Hunt Morgan mapped this trait to the X chromosome in 1910. Like 
humans, Drosophila males have an XY chromosome pair, and females are 
XX. In flies, the wild-type eye color is red (Xᵂ) and it is dominant to white
eye color (Xʷ) ([link]). Because of the location of the eye-color gene,
reciprocal crosses do not produce the same offspring ratios. Males are said
to be hemizygous, because they have only one allele for any X-linked
characteristic. Hemizygosity makes the descriptions of dominance and
recessiveness irrelevant for XY males. Drosophila males lack a second
allele copy on the Y chromosome; that is, their genotype can only be XᵂY
or XʷY. In contrast, females have two allele copies of this gene and can be
XᵂXᵂ, XᵂXʷ, or XʷXʷ.


In Drosophila, several
genes determine eye
color. The genes for
white and vermilion
eye colors are located
on the X
chromosome. Others
are located on the
autosomes. Clockwise
from top left are
brown, cinnabar,
sepia, vermilion,
white, and red. Red
eye color is wild-type
and is dominant to
white eye color.


In an X-linked cross, the genotypes of F₁ and F₂ offspring depend on
whether the recessive trait was expressed by the male or the female in the
P₀ generation. With regard to Drosophila eye color, when the P₀ male
expresses the white-eye phenotype and the female is homozygous red-eyed,
all members of the F₁ generation exhibit red eyes ([link]). The F₁ females
are heterozygous (XᵂXʷ), and the males are all XᵂY, having received their
X chromosome from the homozygous dominant P₀ female and their Y
chromosome from the P₀ male. A subsequent cross between the XᵂXʷ
female and the XᵂY male would produce only red-eyed females (with
XᵂXᵂ or XᵂXʷ genotypes) and both red- and white-eyed males (with XᵂY
or XʷY genotypes). Now, consider a cross between a homozygous
white-eyed female and a male with red eyes. The F₁ generation would exhibit only
heterozygous red-eyed females (XᵂXʷ) and only white-eyed males (XʷY).
Half of the F₂ females would be red-eyed (XᵂXʷ) and half would be
white-eyed (XʷXʷ). Similarly, half of the F₂ males would be red-eyed (XᵂY) and
half would be white-eyed (XʷY).
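The asymmetry between reciprocal X-linked crosses can be checked with a small Punnett-square sketch. The encoding is an assumption made for this example: "W" stands for the red-eye allele on the X chromosome, "w" for the white-eye allele, and "Y" for the Y chromosome, which carries no eye-color allele. The function name `x_linked_cross` is hypothetical.

```python
from collections import Counter
from itertools import product


def x_linked_cross(mother, father):
    """Count offspring by (sex, eye color) for an X-linked eye-color cross.

    mother: tuple of her two X-linked alleles, e.g. ("W", "w")
    father: tuple of his single X-linked allele and "Y", e.g. ("w", "Y")
    """
    offspring = Counter()
    for egg, sperm in product(mother, father):
        if sperm == "Y":
            # Sons are hemizygous: the maternal allele alone sets the phenotype.
            sex = "male"
            eye = "red" if egg == "W" else "white"
        else:
            # Daughters have two alleles; one dominant "W" gives red eyes.
            sex = "female"
            eye = "red" if "W" in (egg, sperm) else "white"
        offspring[(sex, eye)] += 1
    return offspring


# Reciprocal parental crosses give different F1 results.
print(dict(x_linked_cross(("W", "W"), ("w", "Y"))))  # red female x white male
print(dict(x_linked_cross(("w", "w"), ("W", "Y"))))  # white female x red male
```

The first cross yields only red-eyed offspring, while the reciprocal cross yields red-eyed daughters and white-eyed sons, matching the pattern described above. Passing F₁ genotypes back into the same function gives the F₂ expectations.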


Note: 
Art Connection 


Punnett Square Analysis of a Sex-linked Trait 


All female offspring 
have red eyes. 


All male offspring 
have white eyes. 


Punnett square analysis is used to determine
the ratio of offspring from a cross between a
red-eyed male fruit fly and a white-eyed
female fruit fly.


What ratio of offspring would result from a cross between a white-eyed 
male and a female that is heterozygous for red eye color? 


Discoveries in fruit fly genetics can be applied to human genetics. When a 
female parent is homozygous for a recessive X-linked trait, she will pass the 
trait on to 100 percent of her offspring. Her male offspring are, therefore, 
destined to express the trait, as they will inherit their father's Y 
chromosome. In humans, the alleles for certain conditions (some forms of 
color blindness, hemophilia, and muscular dystrophy) are X-linked. 
Females who are heterozygous for these diseases are said to be carriers and 
may not exhibit any phenotypic effects. These females will pass the disease 
to half of their sons and will pass carrier status to half of their daughters; 
therefore, recessive X-linked traits appear more frequently in males than 
females. 


In some groups of organisms with sex chromosomes, the sex with the non- 
homologous sex chromosomes is the female rather than the male. This is 
the case for all birds. In this case, sex-linked traits will be more likely to 
appear in the female, in which they are hemizygous. 


Human Sex-linked Disorders 


Sex-linkage studies in Morgan’s laboratory provided the fundamentals for 
understanding X-linked recessive disorders in humans, which include red- 
green color blindness, and Types A and B hemophilia. Because human 
males need to inherit only one recessive mutant X allele to be affected, X- 
linked disorders are disproportionately observed in males. Females must 
inherit recessive X-linked alleles from both of their parents in order to 
express the trait. When they inherit one recessive X-linked mutant allele 
and one dominant X-linked wild-type allele, they are carriers of the trait and 
are typically unaffected. Carrier females can manifest mild forms of the trait
due to the inactivation of the dominant allele located on one of the X
chromosomes. However, female carriers can contribute the trait to their
sons, resulting in the son exhibiting the trait, or they can contribute the 
recessive allele to their daughters, resulting in the daughters being carriers 
of the trait ([link]). Although some Y-linked recessive disorders exist, 
typically they are associated with infertility in males and are therefore not 
transmitted to subsequent generations. 


(Figure labels: unaffected father; unaffected, carrier mother; dominant
allele; X-linked, recessive allele. Offspring shown: unaffected son,
unaffected daughter, affected son, unaffected carrier daughter.)


The son of a woman who is a carrier 
of a recessive X-linked disorder will 
have a 50 percent chance of being 
affected. A daughter will not be 
affected, but she will have a 50 
percent chance of being a carrier like 
her mother. 


Note: 
Link to Learning 
Watch this video to learn more about sex-linked traits.
https://www.openstaxcollege.org/l/sex-linked_trts


Lethality 


A large proportion of genes in an individual’s genome are essential for 
survival. Occasionally, a nonfunctional allele for an essential gene can arise 
by mutation and be transmitted in a population as long as individuals with 
this allele also have a wild-type, functional copy. The wild-type allele 
functions at a capacity sufficient to sustain life and is therefore considered 
to be dominant over the nonfunctional allele. However, consider two 
heterozygous parents that have a genotype of wild-type/nonfunctional 
mutant for a hypothetical essential gene. In one quarter of their offspring, 
we would expect to observe individuals that are homozygous recessive for 
the nonfunctional allele. Because the gene is essential, these individuals 
might fail to develop past fertilization, die in utero, or die later in life, 
depending on what life stage requires this gene. An inheritance pattern in 
which an allele is only lethal in the homozygous form and in which the 
heterozygote may be normal or have some altered non-lethal phenotype is 
referred to as recessive lethal. 


For crosses between heterozygous individuals with a recessive lethal allele
that causes death before birth when homozygous, only wild-type
homozygotes and heterozygotes would be observed. The genotypic ratio
of heterozygotes to wild-type homozygotes would therefore be 2:1. In other
instances, the recessive lethal allele might also exhibit a dominant (but not
lethal) phenotype in the heterozygote. For
instance, the recessive lethal Curly allele in Drosophila affects wing shape
in the heterozygote form but is lethal in the homozygote.
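The 2:1 ratio for a recessive lethal allele follows from removing one box of the usual 1:2:1 Punnett square. The sketch below makes that step explicit. It assumes a hypothetical essential gene with a functional allele "A" and a nonfunctional allele "a", and it assumes that homozygous "aa" offspring die before they can be counted. The function name `surviving_offspring` is introduced only for this example.

```python
from collections import Counter
from itertools import product


def surviving_offspring(parent1, parent2, lethal="aa"):
    """Count offspring genotypes, omitting the lethal homozygous class."""
    counts = Counter()
    for a, b in product(parent1, parent2):
        genotype = "".join(sorted(a + b))  # write "aA" as "Aa"
        if genotype != lethal:
            # Only offspring that survive to be observed are tallied.
            counts[genotype] += 1
    return counts


print(dict(surviving_offspring("Aa", "Aa")))
```

Of the four equally likely boxes, three remain, so two-thirds of observed offspring are heterozygous and one-third are homozygous wild type.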


A single copy of the wild-type allele is not always sufficient for normal 
functioning or even survival. The dominant lethal inheritance pattern is 
one in which an allele is lethal both in the homozygote and the 
heterozygote; this allele can only be transmitted if the lethality phenotype 
occurs after reproductive age. Individuals with mutations that result in 
dominant lethal alleles fail to survive even in the heterozygote form. 
Dominant lethal alleles are very rare because, as you might expect, the 
allele only lasts one generation and is not transmitted. However, just as the 
recessive lethal allele might not immediately manifest the phenotype of 
death, dominant lethal alleles also might not be expressed until adulthood. 
Once the individual reaches reproductive age, the allele may be 
unknowingly passed on, resulting in a delayed death in both generations. 
An example of this in humans is Huntington’s disease, in which the nervous 
system gradually wastes away ([link]). People who are heterozygous for the 
dominant Huntington allele (Hh) will inevitably develop the fatal disease. 
However, the onset of Huntington’s disease may not occur until age 40, at 
which point the afflicted persons may have already passed the allele to 50 
percent of their offspring. 


The neuron in the center of 
this micrograph (yellow) has 
nuclear inclusions 
characteristic of Huntington’s 
disease (orange area in the 
center of the neuron). 
Huntington’s disease occurs 
when an abnormal dominant 
allele for the Huntington gene 
is present. (credit: Dr. Steven 
Finkbeiner, Gladstone 
Institute of Neurological 
Disease, The Taube-Koret 
Center for Huntington's 
Disease Research, and the 
University of California San 
Francisco/Wikimedia) 


Section Summary 


When true-breeding or homozygous individuals that differ for a certain trait 
are crossed, all of the offspring will be heterozygotes for that trait. If the 
traits are inherited as dominant and recessive, the F1 offspring will all 
exhibit the same phenotype as the parent homozygous for the dominant 
trait. If these heterozygous offspring are self-crossed, the resulting F2 
offspring will be equally likely to inherit gametes carrying the dominant or 
recessive trait, giving rise to offspring of which one quarter are 
homozygous dominant, half are heterozygous, and one quarter are 
homozygous recessive. Because homozygous dominant and heterozygous 
individuals are phenotypically identical, the observed traits in the F2 
offspring will exhibit a ratio of three dominant to one recessive. 


Alleles do not always behave in dominant and recessive patterns. 
Incomplete dominance describes situations in which the heterozygote 
exhibits a phenotype that is intermediate between the homozygous 
phenotypes. Codominance describes the simultaneous expression of both of 
the alleles in the heterozygote. Although diploid organisms can only have 
two alleles for any given gene, it is common for more than two alleles of a 
gene to exist in a population. In humans, as in many animals and some 
plants, females have two X chromosomes and males have one X and one Y 
chromosome. Genes that are present on the X but not the Y chromosome 
are said to be X-linked, such that males only inherit one allele for the gene, 
and females inherit two. Finally, some alleles can be lethal. Recessive lethal 
alleles are only lethal in homozygotes, but dominant lethal alleles are fatal 
in heterozygotes as well. 


Art Connections 


Exercise: 


Problem: 


[link] In pea plants, round peas (R) are dominant to wrinkled peas (r). 
You do a test cross between a pea plant with wrinkled peas (genotype 
rr) and a plant of unknown genotype that has round peas. You end up 
with three plants, all of which have round peas. From this data, can you 
tell if the round pea parent plant is homozygous dominant or 
heterozygous? If the round pea parent plant is heterozygous, what is 
the probability that a random sample of 3 progeny peas will all be 
round? 


Solution: 


[link] You cannot be sure if the plant is homozygous or heterozygous 
as the data set is too small: by random chance, all three plants might 
have inherited only the dominant allele even if the recessive one is 
present in the parent. If the round pea parent is heterozygous, there is a 
one-eighth probability, (1/2) x (1/2) x (1/2), that a random sample of 
three progeny peas will all be round. 
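The one-eighth figure follows from the product rule: each offspring of a heterozygous parent in a test cross independently has a 1/2 chance of receiving the dominant allele. A short sketch (the function name is an illustrative assumption) shows how quickly that chance shrinks as more offspring are scored:

```python
from fractions import Fraction

def prob_all_dominant(n_offspring, p_dominant=Fraction(1, 2)):
    """Chance that every one of n test-cross offspring shows the dominant trait,
    assuming the unknown parent is heterozygous."""
    return p_dominant ** n_offspring

for n in (3, 5, 10):
    print(n, prob_all_dominant(n))
```

With three offspring the chance is 1/8, so an all-round result is unremarkable; with ten offspring it drops to 1/1024, which is why larger samples make the test cross informative.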


Exercise: 


Problem: 
[link] What are the genotypes of the individuals labeled 1, 2 and 3? 
Solution: 


[link] Individual 1 has the genotype aa. Individual 2 has the genotype 
Aa. Individual 3 has the genotype Aa. 

Exercise: 
Problem: 


[link] What ratio of offspring would result from a cross between a 
white-eyed male and a female that is heterozygous for red eye color? 


Solution: 


[link] Half of the female offspring would be heterozygous (X^W X^w) 
with red eyes, and half would be homozygous recessive (X^w X^w) with 
white eyes. Half of the male offspring would be hemizygous dominant 
(X^W Y) with red eyes, and half would be hemizygous recessive (X^w Y) 
with white eyes. 


Review Questions 


Exercise: 


Problem: 


The observable traits expressed by an organism are described as its 


a. phenotype 
b. genotype 
c. alleles 

d. zygote 


Solution: 


A 
Exercise: 
Problem: 


A recessive trait will be observed in individuals that are ________ for 
that trait. 


a. heterozygous 

b. homozygous or heterozygous 
c. homozygous 

d. diploid 


Solution: 


C 
Exercise: 


Problem: 


If black and white true-breeding mice are mated and the result is all 
gray offspring, what inheritance pattern would this be indicative of? 


a. dominance 

b. codominance 

c. multiple alleles 

d. incomplete dominance 


Solution: 


D 
Exercise: 


Problem: 


The ABO blood groups in humans are expressed as the I^A, I^B, and i 
alleles. The I^A allele encodes the A blood group antigen, I^B encodes B, 
and i encodes O. Both A and B are dominant to O. If a heterozygous 
blood type A parent (I^A i) and a heterozygous blood type B parent (I^B i) 
mate, one quarter of their offspring will have AB blood type (I^A I^B) in 
which both antigens are expressed equally. Therefore, ABO blood 
groups are an example of: 


a. multiple alleles and incomplete dominance 
b. codominance and incomplete dominance 
c. incomplete dominance only 

d. multiple alleles and codominance 


Solution: 


D 


Exercise: 


Problem: 


In a mating between two individuals that are heterozygous for a 
recessive lethal allele that is expressed in utero, what genotypic ratio 
(homozygous dominant:heterozygous:homozygous recessive) would 
you expect to observe in the offspring? 


a. 1:2:1 
b. 3:1:1 
c. 1:2:0 
d. 0:2:1 


Solution: 


C 


Free Response 


Exercise: 


Problem: 


The gene for flower position in pea plants exists as axial or terminal 
alleles. Given that axial is dominant to terminal, list all of the possible 
F1 and F2 genotypes and phenotypes from a cross involving parents 
that are homozygous for each trait. Express genotypes with 
conventional genetic abbreviations. 


Solution: 


Because axial is dominant, the gene would be designated as A. F1 
would be all heterozygous Aa with axial phenotype. F2 would have 
possible genotypes of AA, Aa, and aa; these would correspond to axial, 
axial, and terminal phenotypes, respectively. 


Exercise: 


Problem: 


Use a Punnett square to predict the offspring in a cross between a 
dwarf pea plant (homozygous recessive) and a tall pea plant 
(heterozygous). What is the phenotypic ratio of the offspring? 


Solution: 


The Punnett square would be 2 x 2 and will have t and t along the 
top, and T and t along the left side. Clockwise from the top left, the 
genotypes listed within the boxes will be Tt, Tt, tt, and tt. The 
phenotypic ratio will be 1 tall:1 dwarf. 


Exercise: 
Problem: Can a human male be a carrier of red-green color blindness? 
Solution: 
No, males can only express color blindness. They cannot carry it 
because an individual needs two X chromosomes to be a carrier. 
Glossary 


allele 
gene variations that arise by mutation and exist at the same relative 
locations on homologous chromosomes 


autosomes 
any of the non-sex chromosomes 


codominance 
in a heterozygote, complete and simultaneous expression of both 
alleles for the same characteristic 


dominant lethal 


inheritance pattern in which an allele is lethal both in the homozygote 
and the heterozygote; this allele can only be transmitted if the lethality 
phenotype occurs after reproductive age 


genotype 
underlying genetic makeup, consisting of both physically visible and 
non-expressed alleles, of an organism 


hemizygous 
presence of only one allele for a characteristic, as in X-linkage; 
hemizygosity makes descriptions of dominance and recessiveness 
irrelevant 


heterozygous 
having two different alleles for a given gene on the homologous 
chromosome 


homozygous 
having two identical alleles for a given gene on the homologous 
chromosome 


incomplete dominance 
in a heterozygote, expression of two contrasting alleles such that the 
individual displays an intermediate phenotype 


monohybrid 
result of a cross between two true-breeding parents that express 
different traits for only one characteristic 


phenotype 
observable traits expressed by an organism 


Punnett square 
visual representation of a cross between two individuals in which the 
gametes of each individual are denoted along the top and side of a 
grid, respectively, and the possible zygotic genotypes are recombined 
at each box in the grid 


recessive lethal 
inheritance pattern in which an allele is only lethal in the homozygous 
form; the heterozygote may be normal or have some altered, non-lethal 
phenotype 


sex-linked 
any gene on a sex chromosome 


test cross 
cross between a dominant expressing individual with an unknown 
genotype and a homozygous recessive individual; the offspring 
phenotypes indicate whether the unknown parent is heterozygous or 
homozygous for the dominant trait 


X-linked 
gene present on the X, but not the Y chromosome 


Laws of Inheritance 
By the end of this section, you will be able to: 


• Explain Mendel’s law of segregation and independent assortment in 
terms of genetics and the events of meiosis 

• Use the forked-line method and the probability rules to calculate the 
probability of genotypes and phenotypes from multiple gene crosses 

• Explain the effect of linkage and recombination on gamete genotypes 

• Explain the phenotypic outcomes of epistatic effects between genes 


Mendel generalized the results of his pea-plant experiments into four 
postulates, some of which are sometimes called “laws,” that describe the 
basis of dominant and recessive inheritance in diploid organisms. As you 
have learned, more complex extensions of Mendelism exist that do not 
exhibit the same F2 phenotypic ratios (3:1). Nevertheless, these laws 
summarize the basics of classical genetics. 


Pairs of Unit Factors, or Genes 


Mendel proposed first that paired unit factors of heredity were transmitted 
faithfully from generation to generation by the dissociation and 
reassociation of paired factors during gametogenesis and fertilization, 
respectively. After he crossed peas with contrasting traits and found that the 
recessive trait resurfaced in the F2 generation, Mendel deduced that 
hereditary factors must be inherited as discrete units. This finding 
contradicted the belief at that time that parental traits were blended in the 
offspring. 


Alleles Can Be Dominant or Recessive 


Mendel’s law of dominance states that in a heterozygote, one trait will 
conceal the presence of another trait for the same characteristic. Rather than 
both alleles contributing to a phenotype, the dominant allele will be 
expressed exclusively. The recessive allele will remain “latent” but will be 
transmitted to offspring by the same manner in which the dominant allele is 
transmitted. The recessive trait will only be expressed by offspring that 
have two copies of this allele ([link]), and these offspring will breed true 
when self-crossed. 


Since Mendel’s experiments with pea plants, other researchers have found 
that the law of dominance does not always hold true. Instead, several 
different patterns of inheritance have been found to exist. 


The child in the photo 
expresses albinism, a 
recessive trait. 


Equal Segregation of Alleles 


Observing that true-breeding pea plants with contrasting traits gave rise to 
F1 generations that all expressed the dominant trait and F2 generations that 
expressed the dominant and recessive traits in a 3:1 ratio, Mendel proposed 
the law of segregation. This law states that paired unit factors (genes) must 
segregate equally into gametes such that offspring have an equal likelihood 
of inheriting either factor. For the F2 generation of a monohybrid cross, the 
following three possible combinations of genotypes could result: 
homozygous dominant, heterozygous, or homozygous recessive. Because 
heterozygotes could arise from two different pathways (receiving one 
dominant and one recessive allele from either parent), and because 
heterozygotes and homozygous dominant individuals are phenotypically 
identical, the law supports Mendel’s observed 3:1 phenotypic ratio. The 
equal segregation of alleles is the reason we can apply the Punnett square to 
accurately predict the offspring of parents with known genotypes. The 
physical basis of Mendel’s law of segregation is the first division of 
meiosis, in which the homologous chromosomes with their different 
versions of each gene are segregated into daughter nuclei. The role of the 
meiotic segregation of chromosomes in sexual reproduction was not 
understood by the scientific community during Mendel’s lifetime. 


Independent Assortment 


Mendel’s law of independent assortment states that genes do not 
influence each other with regard to the sorting of alleles into gametes, and 
every possible combination of alleles for every gene is equally likely to 
occur. The independent assortment of genes can be illustrated by the 
dihybrid cross, a cross between two true-breeding parents that express 
different traits for two characteristics. Consider the characteristics of seed 
color and seed texture for two pea plants, one that has green, wrinkled seeds 
(yyrr) and another that has yellow, round seeds (YYRR). Because each 
parent is homozygous, the law of segregation indicates that the gametes for 
the green/wrinkled plant all are yr, and the gametes for the yellow/round 
plant are all YR. Therefore, the F1 generation of offspring all are YyRr 
([link]). 


Note: 
Art Connection 


This dihybrid cross of pea plants involves 
the genes for seed color and texture. 


In pea plants, purple flowers (P) are dominant to white flowers (p) and 
yellow peas (Y) are dominant to green peas (y). What are the possible 
genotypes and phenotypes for a cross between PpYY and ppYy pea plants? 
How many squares do you need to do a Punnett square analysis of this 
cross? 


For the F2 generation, the law of segregation requires that each gamete 
receive either an R allele or an r allele along with either a Y allele or a y 
allele. The law of independent assortment states that a gamete into which an 
r allele sorted would be equally likely to contain either a Y allele or a y 
allele. Thus, there are four equally likely gametes that can be formed when 
the YyRr heterozygote is self-crossed, as follows: YR, Yr, yR, and yr. 
Arranging these gametes along the top and left of a 4 x 4 Punnett square 
([link]) gives us 16 equally likely genotypic combinations. From these 
genotypes, we infer a phenotypic ratio of 9 round/yellow:3 round/green:3 
wrinkled/yellow:1 wrinkled/green ([link]). These are the offspring ratios we 
would expect, assuming we performed the crosses with a large enough 
sample size. 
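The 4 x 4 Punnett square can also be filled in programmatically. The sketch below is one way to do it, assuming genotypes are written as two letters per gene with the uppercase letter dominant; the helper names `gametes` and `phenotype` are illustrative only.

```python
from collections import Counter
from itertools import product

def gametes(genotype):
    """All equally likely gametes for a genotype such as 'YyRr' (one allele per gene)."""
    genes = [genotype[i:i + 2] for i in range(0, len(genotype), 2)]
    return ["".join(alleles) for alleles in product(*genes)]

def phenotype(gamete1, gamete2):
    """Per-gene phenotype: the dominant (uppercase) form shows if either gamete carries it."""
    return tuple(a if a.isupper() else b for a, b in zip(gamete1, gamete2))

f1 = "YyRr"
f1_gametes = gametes(f1)  # ['YR', 'Yr', 'yR', 'yr']

# Each cell of the 4 x 4 square pairs one gamete from each parent.
square = Counter(phenotype(g1, g2) for g1, g2 in product(f1_gametes, repeat=2))

labels = {
    ("Y", "R"): "yellow/round",
    ("Y", "r"): "yellow/wrinkled",
    ("y", "R"): "green/round",
    ("y", "r"): "green/wrinkled",
}
for key, name in labels.items():
    print(name, square[key])  # 9, 3, 3, 1 out of 16 cells
```

Counting cells rather than reasoning about them by hand reproduces the 9:3:3:1 ratio directly from the two assumptions in the text: equal segregation and independent assortment.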


Because of independent assortment and dominance, the 9:3:3:1 dihybrid 
phenotypic ratio can be collapsed into two 3:1 ratios, characteristic of any 
monohybrid cross that follows a dominant and recessive pattern. Ignoring 
seed color and considering only seed texture in the above dihybrid cross, we 
would expect that three quarters of the F2 generation offspring would be 
round, and one quarter would be wrinkled. Similarly, isolating only seed 
color, we would assume that three quarters of the F2 offspring would be 
yellow and one quarter would be green. The sorting of alleles for texture 
and color are independent events, so we can apply the product rule. 
Therefore, the proportion of round and yellow F2 offspring is expected to be 
(3/4) x (3/4) = 9/16, and the proportion of wrinkled and green offspring is 
expected to be (1/4) x (1/4) = 1/16. These proportions are identical to those 
obtained using a Punnett square. The proportions of round, green and 
wrinkled, yellow offspring can also be calculated using the product rule, as 
each of these phenotypes includes one dominant and one recessive trait. Therefore, 
the proportion of each is calculated as (3/4) x (1/4) = 3/16. 
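The product-rule arithmetic above can be written out with exact fractions. This is a small sketch; the variable names are illustrative assumptions.

```python
from fractions import Fraction

dominant = Fraction(3, 4)   # chance of showing the dominant trait for one gene
recessive = Fraction(1, 4)  # chance of showing the recessive trait for one gene

# Texture and color sort independently, so per-gene probabilities multiply.
proportions = {
    "round, yellow": dominant * dominant,
    "round, green": dominant * recessive,
    "wrinkled, yellow": recessive * dominant,
    "wrinkled, green": recessive * recessive,
}

for trait_pair, p in proportions.items():
    print(trait_pair, p)

# The four classes are exhaustive, so their proportions must add up to 1.
assert sum(proportions.values()) == 1
```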


The law of independent assortment also indicates that a cross between 
yellow, wrinkled (YYrr) and green, round (yyRR) parents would yield the 
same F1 and F2 offspring as in the YYRR x yyrr cross. 


The physical basis for the law of independent assortment also lies in 
meiosis I, in which the different homologous pairs line up in random 
orientations. Each gamete can contain any combination of paternal and 
maternal chromosomes (and therefore the genes on them) because the 
orientation of tetrads on the metaphase plane is random. 


Forked-Line Method 


When more than two genes are being considered, the Punnett-square 
method becomes unwieldy. For instance, examining a cross involving four 
genes would require a 16 x 16 grid containing 256 boxes. It would be 
extremely cumbersome to manually enter each genotype. For more complex 
crosses, the forked-line and probability methods are preferred. 


To prepare a forked-line diagram for a cross between F1 heterozygotes 
resulting from a cross between AABBCC and aabbcc parents, we first create 
rows equal to the number of genes being considered, and then segregate the 
alleles in each row on forked lines according to the probabilities for 
individual monohybrid crosses ([link]). We then multiply the values along 
each forked path to obtain the F2 offspring probabilities. Note that this 
process is a diagrammatic version of the product rule. The values along 
each forked pathway can be multiplied because each gene assorts 
independently. For a trihybrid cross, the F2 phenotypic ratio is 
27:9:9:9:3:3:3:1. 


[Figure: forked-line diagram. Row 1, color: 3 yellow:1 green. Row 2, 
shape: 3 round:1 wrinkled. Row 3, height: 3 tall:1 dwarf. Products along 
each path: 27 yellow round tall, 9 yellow round dwarf, 9 yellow wrinkled 
tall, 3 yellow wrinkled dwarf, 9 green round tall, 3 green round dwarf, 
3 green wrinkled tall, 1 green wrinkled dwarf.] 


The forked-line method can be used to analyze a trihybrid 
cross. Here, the probability for color in the F2 generation 
occupies the top row (3 yellow:1 green). The probability for 
shape occupies the second row (3 round:1 wrinkled), and the 
probability for height occupies the third row (3 tall:1 dwarf). 
The probability for each possible combination of traits is 
calculated by multiplying the probability for each individual 
trait. Thus, the expected proportion of F2 offspring having yellow, 
round, and tall traits is 3 x 3 x 3 out of 4 x 4 x 4, or 27/64. 
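A forked-line diagram amounts to multiplying one weight from each row along every path. The sketch below walks all eight paths for the color, shape, and height example; the data layout is an illustrative assumption, not part of the original figure.

```python
from itertools import product
from math import prod

# One row per gene; each entry is (phenotype, relative weight in a monohybrid cross).
rows = [
    [("yellow", 3), ("green", 1)],
    [("round", 3), ("wrinkled", 1)],
    [("tall", 3), ("dwarf", 1)],
]

# Every path through the forks picks one entry per row; weights multiply along the path.
paths = {
    " ".join(name for name, _ in path): prod(weight for _, weight in path)
    for path in product(*rows)
}

total = sum(paths.values())  # 4 x 4 x 4 = 64 equally likely genotype combinations
for phenotype, weight in paths.items():
    print(f"{phenotype}: {weight}/{total}")
```

Reading off the weights in order gives the 27:9:9:9:3:3:3:1 ratio, and dividing each weight by 64 converts it to a probability.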


Probability Method 


While the forked-line method is a diagrammatic approach to keeping track 
of probabilities in a cross, the probability method gives the proportions of 
offspring expected to exhibit each phenotype (or genotype) without the 
added visual assistance. Both methods make use of the product rule and 
consider the alleles for each gene separately. Earlier, we examined the 
phenotypic proportions for a trihybrid cross using the forked-line method; 
now we will use the probability method to examine the genotypic 
proportions for a cross with even more genes. 


For a trihybrid cross, writing out the forked-line method is tedious, albeit 
not as tedious as using the Punnett-square method. To fully demonstrate the 
power of the probability method, however, we can consider specific genetic 
calculations. For instance, for a tetrahybrid cross between individuals that 
are heterozygotes for all four genes, and in which all four genes are sorting 
independently and in a dominant and recessive pattern, what proportion of 
the offspring will be expected to be homozygous recessive for all four 
alleles? Rather than writing out every possible genotype, we can use the 
probability method. We know that for each gene, the fraction of 
homozygous recessive offspring will be 1/4. Therefore, multiplying this 
fraction for each of the four genes, (1/4) x (1/4) x (1/4) x (1/4), we 
determine that 1/256 of the offspring will be quadruply homozygous 
recessive. 


For the same tetrahybrid cross, what is the expected proportion of offspring 
that have the dominant phenotype at all four loci? We can answer this 
question using phenotypic proportions, but let’s do it the hard way—using 
genotypic proportions. The question asks for the proportion of offspring 
that are 1) homozygous dominant at A or heterozygous at A, and 2) 
homozygous dominant at B or heterozygous at B, and so on. Noting the “or” and 
“and” in each circumstance makes clear where to apply the sum and 
product rules. The probability of a homozygous dominant at A is 1/4 and 
the probability of a heterozygote at A is 1/2. The probability of the 
homozygote or the heterozygote is 1/4 + 1/2 = 3/4 using the sum rule. The 
same probability can be obtained in the same way for each of the other 
genes, so that the probability of a dominant phenotype at A and B and C and 
D is, using the product rule, equal to 3/4 x 3/4 x 3/4 x 3/4, or 81/256. If you 
are ever unsure about how to combine probabilities, returning to the forked- 
line method should make it clear. 
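Both tetrahybrid calculations can be reproduced with exact fractions, which makes the roles of the sum rule ("or") and the product rule ("and") explicit. This is a minimal sketch; the constant and function names are illustrative assumptions.

```python
from fractions import Fraction

# Genotype probabilities for one gene in a cross between two heterozygotes.
P_HOM_DOM = Fraction(1, 4)
P_HET = Fraction(1, 2)
P_HOM_REC = Fraction(1, 4)

def all_homozygous_recessive(n_genes):
    """Product rule: homozygous recessive at gene 1 AND gene 2 AND ..."""
    return P_HOM_REC ** n_genes

def all_dominant_phenotype(n_genes):
    """Sum rule within a gene (homozygous dominant OR heterozygous),
    then product rule across independently assorting genes."""
    per_gene = P_HOM_DOM + P_HET
    return per_gene ** n_genes

print(all_homozygous_recessive(4))  # 1/256
print(all_dominant_phenotype(4))    # 81/256
```

For four genes, (3/4) x (3/4) x (3/4) x (3/4) evaluates to 81/256; the value 27/64 would correspond to three genes.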


Rules for Multihybrid Fertilization 


Predicting the genotypes and phenotypes of offspring from given crosses is 
the best way to test your knowledge of Mendelian genetics. Given a 
multihybrid cross that obeys independent assortment and follows a 
dominant and recessive pattern, several generalized rules exist; you can use 
these rules to check your results as you work through genetics calculations 
([link]). To apply these rules, first you must determine n, the number of 
heterozygous gene pairs (the number of genes segregating two alleles each). 
For example, a cross between AaBb and AaBb heterozygotes has an n of 2. 
In contrast, a cross between AABb and AABb has an n of 1 because A is not 
heterozygous. 


General Rules for Multihybrid Crosses 


General Rule                                   Number of Heterozygous Gene Pairs (n) 
Number of different F1 gametes                 2^n 
Number of different F2 genotypes               3^n 
Given dominant and recessive inheritance,      2^n 
the number of different F2 phenotypes 
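For n heterozygous gene pairs, the standard counts are 2^n different gametes, 3^n different genotypes, and (with complete dominance) 2^n different phenotypes. The sketch below checks those counts by brute-force enumeration for small n; the function name and the use of a single letter pair for every gene are illustrative simplifications.

```python
from itertools import product

def count_classes(n):
    """Count distinct gametes, genotypes, and phenotypes for n heterozygous gene pairs."""
    alleles = ("A", "a")  # the same symbols are reused per gene; position identifies the gene
    gametes = set(product(alleles, repeat=n))
    genotypes, phenotypes = set(), set()
    for g1, g2 in product(gametes, repeat=2):
        # Genotype per gene is unordered: 'Aa' and 'aA' are the same heterozygote.
        genotypes.add(tuple("".join(sorted(pair)) for pair in zip(g1, g2)))
        # Phenotype per gene is dominant if at least one dominant allele is present.
        phenotypes.add(tuple("A" in pair for pair in zip(g1, g2)))
    return len(gametes), len(genotypes), len(phenotypes)

for n in range(1, 5):
    print(n, count_classes(n))
```

For n = 2 the enumeration gives 4 gametes, 9 genotypes, and 4 phenotypes, in line with the dihybrid cross discussed earlier.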


Linked Genes Violate the Law of Independent Assortment 


Although all of Mendel’s pea characteristics behaved according to the law 
of independent assortment, we now know that some allele combinations are 
not inherited independently of each other. Genes that are located on 
separate non-homologous chromosomes will always sort independently. 
However, each chromosome contains hundreds or thousands of genes, 
organized linearly on chromosomes like beads on a string. The segregation 
of alleles into gametes can be influenced by linkage, in which genes that 
are located physically close to each other on the same chromosome are 
more likely to be inherited as a pair. However, because of the process of 
recombination, or “crossover,” it is possible for two genes on the same 
chromosome to behave independently, or as if they are not linked. To 
understand this, let’s consider the biological basis of gene linkage and 
recombination. 


Homologous chromosomes possess the same genes in the same linear order. 
The alleles may differ on homologous chromosome pairs, but the genes to 
which they correspond do not. In preparation for the first division of 
meiosis, homologous chromosomes replicate and synapse. Like genes on 
the homologs align with each other. At this stage, segments of homologous 
chromosomes exchange linear segments of genetic material ([link]). This 
process is called recombination, or crossover, and it is a common genetic 
process. Because the genes are aligned during recombination, the gene 
order is not altered. Instead, the result of recombination is that maternal and 
paternal alleles are combined onto the same chromosome. Across a given 
chromosome, several recombination events may occur, causing extensive 
shuffling of alleles. 


[Figure labels: homologous chromosomes aligned; chromosome crossover; 
recombinant chromosomes; non-recombinant chromosomes] 

The process of crossover, or 
recombination, occurs when 
two homologous 
chromosomes align during 
meiosis and exchange a 
segment of genetic material. 
Here, the alleles for gene C 
were exchanged. The result is 
two recombinant and two 
non-recombinant 
chromosomes. 


When two genes are located in close proximity on the same chromosome, 
they are considered linked, and their alleles tend to be transmitted through 
meiosis together. To exemplify this, imagine a dihybrid cross involving 
flower color and plant height in which the genes are next to each other on 
the chromosome. If one homologous chromosome has alleles for tall plants 
and red flowers, and the other chromosome has alleles for short plants and 
yellow flowers, then when the gametes are formed, the tall and red alleles 
will go together into a gamete and the short and yellow alleles will go into 
other gametes. These are called the parental genotypes because they have 
been inherited intact from the parents of the individual producing gametes. 
Unlike the situation in which the genes are on different chromosomes, there 
will be no gametes with tall and yellow alleles and no gametes with short and 
red alleles. If you create the Punnett square with these gametes, you will see 
that the classical Mendelian prediction of a 9:3:3:1 outcome of a dihybrid 
cross would not apply. As the distance between two genes increases, the 
probability of one or more crossovers between them increases, and the 
genes behave more like they are on separate chromosomes. Geneticists have 
used the proportion of recombinant gametes (the ones not like the parents) 
as a measure of how far apart genes are on a chromosome. Using this 
information, they have constructed elaborate maps of genes on 
chromosomes for well-studied organisms, including humans. 
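The proportion of recombinant gametes is a simple ratio. The sketch below computes it for the tall/short and red/yellow example; the gamete counts are invented purely for illustration and are not data from the text.

```python
def recombination_frequency(parental, recombinant):
    """Fraction of gametes that carry a non-parental combination of alleles."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("at least one gamete must be counted")
    return recombinant / total

# Hypothetical counts, assumed only for this example.
counts = {
    "tall/red": 430,      # parental
    "short/yellow": 420,  # parental
    "tall/yellow": 80,    # recombinant
    "short/red": 70,      # recombinant
}
parental = counts["tall/red"] + counts["short/yellow"]
recombinant = counts["tall/yellow"] + counts["short/red"]

frequency = recombination_frequency(parental, recombinant)
print(f"{frequency:.0%} of gametes are recombinant")
```

A frequency near 0 indicates tightly linked genes, while a frequency near 50 percent is what independent assortment produces, so widely separated genes on one chromosome can look unlinked.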


Mendel’s seminal publication makes no mention of linkage, and many 
researchers have questioned whether he encountered linkage but chose not 
to publish those crosses out of concern that they would invalidate his 
independent assortment postulate. The garden pea has seven pairs of chromosomes, 
and some have suggested that his choice of seven characteristics was not a 
coincidence. However, even if the genes he examined were not located on 
separate chromosomes, it is possible that he simply did not observe linkage 
because of the extensive shuffling effects of recombination. 


Note: 

Scientific Method Connection 

Testing the Hypothesis of Independent Assortment 

To better appreciate the amount of labor and ingenuity that went into 
Mendel’s experiments, proceed through one of Mendel’s dihybrid crosses. 
Question: What will be the offspring of a dihybrid cross? 

Background: Consider that pea plants mature in one growing season, and 
you have access to a large garden in which you can cultivate thousands of 
pea plants. There are several true-breeding plants with the following pairs 
of traits: tall plants with inflated pods, and dwarf plants with constricted 
pods. Before the plants have matured, you remove the pollen-producing 
organs from the tall/inflated plants in your crosses to prevent self- 
fertilization. Upon plant maturation, the plants are manually crossed by 
transferring pollen from the dwarf/constricted plants to the stigmata of the 
tall/inflated plants. 

Hypothesis: Both trait pairs will sort independently according to 
Mendelian laws. When the true-breeding parents are crossed, all of the F1 
offspring are tall and have inflated pods, which indicates that the tall and 
inflated traits are dominant over the dwarf and constricted traits, 
respectively. A self-cross of the F1 heterozygotes results in 2,000 F2 
progeny. 

Test the hypothesis: Because each trait pair sorts independently, the ratios 
of tall:dwarf and inflated:constricted are each expected to be 3:1. The 
tall/dwarf trait pair is called T/t, and the inflated/constricted trait pair is 
designated I/i. Each member of the F1 generation therefore has a genotype 
of TtIi. Construct a grid analogous to [link], in which you cross two TtIi 
individuals. Each individual can donate four combinations of two alleles: TI, 
Ti, tI, or ti, meaning that there are 16 possibilities of offspring genotypes. 
Because the T and I alleles are dominant, any individual having one or two 
of those alleles will express the tall or inflated phenotypes, respectively, 
regardless of whether they also have a t or i allele. Only individuals that are 
tt or ii will express the dwarf or constricted phenotypes, respectively. As shown in 
[link], you predict that you will observe the following offspring 
proportions: tall/inflated:tall/constricted:dwarf/inflated:dwarf/constricted in 
a 9:3:3:1 ratio. Notice from the grid that when considering the tall/dwarf 
and inflated/constricted trait pairs in isolation, they are each inherited in 
3:1 ratios. 


This figure shows all possible 
combinations of offspring 
resulting from a dihybrid cross of 
pea plants that are heterozygous 
for the tall/dwarf and 
inflated/constricted alleles. 


Test the hypothesis: You cross the dwarf and tall plants and then self-cross 
the offspring. For best results, this is repeated with hundreds or even 
thousands of pea plants. What special precautions should be taken in the 
crosses and in growing the plants? 

Analyze your data: You observe the following plant phenotypes in the F2 
generation: 2706 tall/inflated, 930 tall/constricted, 888 dwarf/inflated, and 
300 dwarf/constricted. Reduce these findings to a ratio and determine if 
they are consistent with Mendelian laws. 

Form a conclusion: Were the results close to the expected 9:3:3:1 
phenotypic ratio? Do the results support the prediction? What might be 
observed if far fewer plants were used, given that alleles segregate 
randomly into gametes? Try to imagine growing that many pea plants, and 
consider the potential for experimental error. For instance, what would 
happen if it was extremely windy one day? 
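One way to reduce the observed counts to a ratio is to divide each by the smallest class and then compare the counts with what a 9:3:3:1 ratio would predict for the same total. The sketch below does only that arithmetic, using the four counts given above; the variable names are illustrative, and judging whether the deviations are acceptable is left to the reader.

```python
observed = {
    "tall/inflated": 2706,
    "tall/constricted": 930,
    "dwarf/inflated": 888,
    "dwarf/constricted": 300,
}
expected_ratio = {
    "tall/inflated": 9,
    "tall/constricted": 3,
    "dwarf/inflated": 3,
    "dwarf/constricted": 1,
}

total = sum(observed.values())
smallest = min(observed.values())

# Observed ratio, scaled so that the rarest class equals 1.
reduced = {name: round(count / smallest, 2) for name, count in observed.items()}

# Counts a perfect 9:3:3:1 ratio would give for the same number of plants.
expected = {name: total * parts / 16 for name, parts in expected_ratio.items()}

for name in observed:
    print(f"{name}: observed {observed[name]}, "
          f"reduced {reduced[name]}, expected {expected[name]}")
```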


Epistasis 


Mendel’s studies in pea plants implied that the sum of an individual’s 
phenotype was controlled by genes (or as he called them, unit factors), such 
that every characteristic was distinctly and completely controlled by a 
single gene. In fact, single observable characteristics are almost always 
under the influence of multiple genes (each with two or more alleles) acting 
in unison. For example, at least eight genes contribute to eye color in 
humans. 


Note: 
Link to Learning 
Eye color in humans is determined by multiple genes. Use the Eye Color 
Calculator to predict the eye color of children from parental eye color. 


In some cases, several genes can contribute to aspects of a common 
phenotype without their gene products ever directly interacting. In the case 
of organ development, for instance, genes may be expressed sequentially, 
with each gene adding to the complexity and specificity of the organ. Genes 
may function in complementary or synergistic fashions, such that two or 
more genes need to be expressed simultaneously to affect a phenotype. 
Genes may also oppose each other, with one gene modifying the expression 
of another. 


In epistasis, the interaction between genes is antagonistic, such that one 
gene masks or interferes with the expression of another. “Epistasis” is a 
word composed of Greek roots that mean “standing upon.” The alleles that 
are being masked or silenced are said to be hypostatic to the epistatic alleles 
that are doing the masking. Often the biochemical basis of epistasis is a 
gene pathway in which the expression of one gene is dependent on the 
function of a gene that precedes or follows it in the pathway. 


An example of epistasis is pigmentation in mice. The wild-type coat color, 
agouti (AA), is dominant to solid-colored fur (aa). However, a separate gene 
(C) is necessary for pigment production. A mouse with a recessive c allele 
at this locus is unable to produce pigment and is albino regardless of the 
allele present at locus A ([link]). Therefore, the genotypes AAcc, Aacc, and 
aacc all produce the same albino phenotype. A cross between heterozygotes 
for both genes (AaCc x AaCc) would generate offspring with a phenotypic 
ratio of 9 agouti:3 solid color:4 albino ([link]). In this case, the C gene is 
epistatic to the A gene. 
In mice, the mottled agouti coat color 
(A) is dominant to a solid coloration, 
such as black or gray. A gene at a 
separate locus (C) is responsible for 
pigment production. The recessive c 
allele does not produce pigment, and a 
mouse with the homozygous recessive 
cc genotype is albino regardless of the 
allele present at the A locus. Thus, the 
C gene is epistatic to the A gene. 


Epistasis can also occur when a dominant allele masks expression at a 
separate gene. Fruit color in summer squash is expressed in this way. 
Homozygous recessive expression of the W gene (ww) coupled with 
homozygous dominant or heterozygous expression of the Y gene (YY or Yy) 
generates yellow fruit, and the wwyy genotype produces green fruit. 
However, if a dominant copy of the W gene is present in the homozygous or 
heterozygous form, the summer squash will produce white fruit regardless 
of the Y alleles. A cross between white heterozygotes for both genes (WwYy 
x WwYy) would produce offspring with a phenotypic ratio of 12 white:3 
yellow:1 green. 


Finally, epistasis can be reciprocal such that either gene, when present in the 
dominant (or recessive) form, expresses the same phenotype. In the 
shepherd’s purse plant (Capsella bursa-pastoris), the characteristic of seed 
shape is controlled by two genes in a dominant epistatic relationship. When 
the genes A and B are both homozygous recessive (aabb), the seeds are 
ovoid. If the dominant allele for either of these genes is present, the result is 
triangular seeds. That is, every possible genotype other than aabb results in 
triangular seeds, and a cross between heterozygotes for both genes (AaBb x 
AaBb) would yield offspring with a phenotypic ratio of 15 triangular:1 
ovoid. 


As you work through genetics problems, keep in mind that any single 
characteristic that results in a phenotypic ratio that totals 16 is typical of a 
two-gene interaction. Recall the phenotypic inheritance pattern for 
Mendel’s dihybrid cross, which considered two non-interacting genes— 
9:3:3:1. Similarly, we would expect interacting gene pairs to also exhibit 
ratios expressed as 16 parts. Note that we are assuming the interacting 
genes are not linked; they are still assorting independently into gametes. 
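
The three 16-part ratios described above can be checked by enumerating the 16 equally likely gamete pairings of a dihybrid cross. The following Python sketch is illustrative only: the function names (gametes, cross, mouse_coat, squash_fruit, purse_seed) are invented for this example, and the phenotype rules simply restate the text. It assumes unlinked genes and complete dominance at each locus.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """Return the four gametes of a two-gene genotype such as 'AaCc'."""
    first, second = genotype[:2], genotype[2:]
    return [a + b for a, b in product(first, second)]


def has_dominant(locus):
    """True if at least one allele at the locus is dominant (uppercase)."""
    return any(allele.isupper() for allele in locus)


def cross(parent1, parent2, phenotype):
    """Count offspring phenotypes over all 16 gamete pairings."""
    counts = Counter()
    for g1, g2 in product(gametes(parent1), gametes(parent2)):
        locus1 = g1[0] + g2[0]  # alleles of the first gene
        locus2 = g1[1] + g2[1]  # alleles of the second gene
        counts[phenotype(locus1, locus2)] += 1
    return counts


def mouse_coat(a_locus, c_locus):
    # Recessive epistasis: cc blocks pigment regardless of the A locus.
    if not has_dominant(c_locus):
        return "albino"
    return "agouti" if has_dominant(a_locus) else "solid"


def squash_fruit(w_locus, y_locus):
    # Dominant epistasis: any W allele gives white fruit.
    if has_dominant(w_locus):
        return "white"
    return "yellow" if has_dominant(y_locus) else "green"


def purse_seed(a_locus, b_locus):
    # Either dominant allele is enough for triangular seeds.
    if has_dominant(a_locus) or has_dominant(b_locus):
        return "triangular"
    return "ovoid"


mouse = cross("AaCc", "AaCc", mouse_coat)
squash = cross("WwYy", "WwYy", squash_fruit)
purse = cross("AaBb", "AaBb", purse_seed)
print(dict(mouse))   # 9 agouti, 3 solid, 4 albino
print(dict(squash))  # 12 white, 3 yellow, 1 green
print(dict(purse))   # 15 triangular, 1 ovoid
```

Because every pairing of gametes is equally likely when the genes assort independently, the counts always total 16; only the rule that maps genotype to phenotype changes from one example to the next.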


Note: 
Link to Learning 


For an excellent review of Mendel’s experiments and to perform your own 
crosses and identify patterns of inheritance, visit the Mendel’s Peas web 
lab. 


Section Summary 


Mendel postulated that genes (characteristics) are inherited as pairs of 
alleles (traits) that behave in a dominant and recessive pattern. Alleles 
segregate into gametes such that each gamete is equally likely to receive 
either one of the two alleles present in a diploid individual. In addition, 
genes are assorted into gametes independently of one another. That is, 
alleles are generally not more likely to segregate into a gamete with a 
particular allele of another gene. A dihybrid cross demonstrates 
independent assortment when the genes in question are on different 
chromosomes or distant from each other on the same chromosome. For 
crosses involving more than two genes, use the forked line or probability 
methods to predict offspring genotypes and phenotypes rather than a 
Punnett square. 


Although chromosomes sort independently into gametes during meiosis, 
Mendel’s law of independent assortment refers to genes, not chromosomes, 
and a single chromosome may carry more than 1,000 genes. When genes 
are located in close proximity on the same chromosome, their alleles tend to 
be inherited together. This results in offspring ratios that violate Mendel's 
law of independent assortment. However, recombination serves to exchange 
genetic material on homologous chromosomes such that maternal and 
paternal alleles may be recombined on the same chromosome. This is why 
alleles on a given chromosome are not always inherited together. 
Recombination is a random event occurring anywhere on a chromosome. 
Therefore, genes that are far apart on the same chromosome are likely to 
still assort independently because of recombination events that occurred in 
the intervening chromosomal space. 


Whether or not they are sorting independently, genes may interact at the 
level of gene products such that the expression of an allele for one gene 
masks or modifies the expression of an allele for a different gene. This is 
called epistasis. 


Art Connections 


Exercise: 


Problem: 


[link] In pea plants, purple flowers (P) are dominant to white flowers 
(p) and yellow peas (Y) are dominant to green peas (y). What are the 
possible genotypes and phenotypes for a cross between PpYY and 
ppYy pea plants? How many squares do you need to do a Punnett 
square analysis of this cross? 


Solution: 


[link] The possible genotypes are PpYY, PpYy, ppYY, and ppYy. The 
former two genotypes would result in plants with purple flowers and 
yellow peas, while the latter two genotypes would result in plants with 
white flowers with yellow peas, for a 1:1 ratio of each phenotype. You 
only need a 2 x 2 Punnett square (four squares total) to do this analysis 
because each parent is homozygous for one of the two genes and therefore 
produces only two kinds of gametes. 


Multiple Choice 


Exercise: 
Problem: 
Assuming no gene linkage, in a dihybrid cross of AABB x aabb with 
AaBb F1 heterozygotes, what is the ratio of the F1 gametes (AB, aB, 
Ab, ab) that will give rise to the F2 offspring? 


Solution: 


A 
Exercise: 


Problem: 


The forked line and probability methods make use of what probability 
rule? 


a. test cross 

b. product rule 

c. monohybrid rule 
d. sum rule 


Solution: 


B 


Exercise: 
Problem: 
How many different offspring genotypes are expected in a trihybrid 
cross between parents heterozygous for all three traits when the traits 
behave in a dominant and recessive pattern? How many phenotypes? 


a. 64 genotypes; 16 phenotypes 
b. 16 genotypes; 64 phenotypes 
c. 8 genotypes; 27 phenotypes 
d. 27 genotypes; 8 phenotypes 


Solution: 


D 


Free Response 


Exercise: 
Problem: 


Use the probability method to calculate the genotypes and genotypic 
proportions of a cross between AABBCc and Aabbcc parents. 


Solution: 


Considering each gene separately, the cross at A will produce offspring 
of which half are AA and half are Aa; B will produce all Bb; C will 
produce half Cc and half cc. Proportions then are (1/2) x (1) x (1/2), or 
1/4 AABbCc; continuing for the other possibilities yields 1/4 AABbcc, 
1/4 AaBbCc, and 1/4 AaBbcc. The proportions therefore are 1:1:1:1. 
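
The product rule used in this solution can be sketched in Python as a check on the arithmetic. This is a minimal illustration, not a general genetics tool: the helper names single_gene_cross and multi_gene_cross are invented here, and the sketch assumes unlinked genes and genotypes written as two-letter strings per gene.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def single_gene_cross(p1, p2):
    """Genotype probabilities for one gene, e.g. 'AA' x 'Aa'."""
    counts = Counter()
    for a, b in product(p1, p2):
        # Sorting puts the uppercase (dominant) allele first: 'aA' -> 'Aa'.
        counts["".join(sorted(a + b))] += 1
    return {genotype: Fraction(n, 4) for genotype, n in counts.items()}


def multi_gene_cross(p1_genes, p2_genes):
    """Combine independent single-gene results with the product rule."""
    per_gene = [single_gene_cross(g1, g2) for g1, g2 in zip(p1_genes, p2_genes)]
    result = {}
    for combo in product(*(d.items() for d in per_gene)):
        genotype = "".join(g for g, _ in combo)
        probability = Fraction(1)
        for _, p in combo:
            probability *= p  # multiply the per-gene probabilities
        result[genotype] = probability
    return result


offspring = multi_gene_cross(["AA", "BB", "Cc"], ["Aa", "bb", "cc"])
for genotype, probability in sorted(offspring.items()):
    print(genotype, probability)
```

Running the same function on a trihybrid cross between AaBbCc parents returns 27 distinct genotypes, which agrees with the multiple-choice answer given earlier in this section.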


Exercise: 


Problem: 
Explain epistasis in terms of its Greek-language roots “standing upon.” 
Solution: 


Epistasis describes an antagonistic interaction between genes wherein 
one gene masks or interferes with the expression of another. The gene 
that is interfering is referred to as epistatic, as if it is “standing upon” 
the other (hypostatic) gene to block its expression. 


Exercise: 


Problem: 


In Section 12.3, “Laws of Inheritance,” an example of epistasis was 
given for the summer squash. Cross white WwYy heterozygotes to 
prove the phenotypic ratio of 12 white:3 yellow:1 green that was given 
in the text. 


Solution: 


The cross can be represented as a 4 x 4 Punnett square, with the 
following gametes for each parent: WY, Wy, wY, and wy. For all 12 of 
the offspring that express a dominant W gene, the offspring will be 
white. The three offspring that are homozygous recessive for w but 
express a dominant Y gene will be yellow. The remaining wwyy 
offspring will be green. 


Glossary 


dihybrid 
result of a cross between two true-breeding parents that express 
different traits for two characteristics 


epistasis 
antagonistic interaction between genes such that one gene masks or 
interferes with the expression of another 


law of dominance 
in a heterozygote, one trait will conceal the presence of another trait 
for the same characteristic 


law of independent assortment 
genes do not influence each other with regard to sorting of alleles into 
gametes; every possible combination of alleles is equally likely to 
occur 


law of segregation 
paired unit factors (i.e., genes) segregate equally into gametes such 
that offspring have an equal likelihood of inheriting any combination 
of factors 


linkage 
phenomenon in which alleles that are located in close proximity to 
each other on the same chromosome are more likely to be inherited 
together 


Introduction 


Each of us, like these other large multicellular organisms, begins life as a 
fertilized egg. After trillions of cell divisions, each of us develops into a 
complex, multicellular organism. (credit a: modification of work by Frank 
Wouters; credit b: modification of work by Ken Cole, USGS; credit c: 
modification of work by Martin Pettitt) 


The ability to reproduce in kind is a basic characteristic of all living things. 
In kind means that the offspring of any organism closely resemble their 
parent or parents. Hippopotamuses give birth to hippopotamus calves, 
Joshua trees produce seeds from which Joshua tree seedlings emerge, and 
adult flamingos lay eggs that hatch into flamingo chicks. In kind does not 
generally mean exactly the same. Whereas many unicellular organisms and 
a few multicellular organisms can produce genetically identical clones of 
themselves through cell division, many single-celled organisms and most 
multicellular organisms reproduce regularly using another method. Sexual 
reproduction is the production by parents of two haploid cells and the 
fusion of two haploid cells to form a single, unique diploid cell. In most 
plants and animals, through tens of rounds of mitotic cell division, this 
diploid cell will develop into an adult organism. Haploid cells that are part 
of the sexual reproductive cycle are produced by a type of cell division 
called meiosis. Sexual reproduction, specifically meiosis and fertilization, 
introduces variation into offspring that may account for the evolutionary 
success of sexual reproduction. The vast majority of eukaryotic organisms, 
both multicellular and unicellular, can or must employ some form of 
meiosis and fertilization to reproduce. 


The Process of Meiosis 
By the end of this section, you will be able to: 


• Describe the behavior of chromosomes during meiosis 
• Describe cellular events during meiosis 
• Explain the differences between meiosis and mitosis 
• Explain the mechanisms within meiosis that generate genetic variation 
  among the products of meiosis 


Sexual reproduction requires fertilization, the union of two cells from two 
individual organisms. If those two cells each contain one set of 
chromosomes, then the resulting cell contains two sets of chromosomes. 
Haploid cells contain one set of chromosomes. Cells containing two sets of 
chromosomes are called diploid. The number of sets of chromosomes in a 
cell is called its ploidy level. If the reproductive cycle is to continue, then 
the diploid cell must somehow reduce its number of chromosome sets 
before fertilization can occur again, or there will be a continual doubling in 
the number of chromosome sets in every generation. So, in addition to 
fertilization, sexual reproduction includes a nuclear division that reduces 
the number of chromosome sets. 


Most animals and plants are diploid, containing two sets of chromosomes. 
In each somatic cell of the organism (all cells of a multicellular organism 
except the gametes or reproductive cells), the nucleus contains two copies 
of each chromosome, called homologous chromosomes. Somatic cells are 
sometimes referred to as “body” cells. Homologous chromosomes are 
matched pairs containing the same genes in identical locations along their 
length. Diploid organisms inherit one copy of each homologous 
chromosome from each parent; all together, they are considered a full set of 
chromosomes. Haploid cells, containing a single copy of each homologous 
chromosome, are found only within structures that give rise to either 
gametes or spores. Spores are haploid cells that can produce a haploid 
organism or can fuse with another spore to form a diploid cell. All animals 
and most plants produce eggs and sperm, or gametes. Some plants and all 
fungi produce spores. 


The nuclear division that forms haploid cells, which is called meiosis, is 
related to mitosis. As you have learned, mitosis is the part of a cell 
reproduction cycle that results in identical daughter nuclei that are also 
genetically identical to the original parent nucleus. In mitosis, both the 
parent and the daughter nuclei are at the same ploidy level—diploid for 
most plants and animals. Meiosis employs many of the same mechanisms 
as mitosis. However, the starting nucleus is always diploid and the nuclei 
that result at the end of a meiotic cell division are haploid. To achieve this 
reduction in chromosome number, meiosis consists of one round of 
chromosome duplication and two rounds of nuclear division. Because the 
events that occur during each of the division stages are analogous to the 
events of mitosis, the same stage names are assigned. However, because 
there are two rounds of division, the major process and the stages are 
designated with a “I” or a “II.” Thus, meiosis I is the first round of meiotic 
division and consists of prophase I, prometaphase I, and so on. Meiosis II, 
in which the second round of meiotic division takes place, includes 
prophase II, prometaphase II, and so on. 


Meiosis I 


Meiosis is preceded by an interphase consisting of the G1, S, and G2 phases, 
which are nearly identical to the phases preceding mitosis. The G1 phase, 
which is also called the first gap phase, is the first phase of the interphase 
and is focused on cell growth. The S phase is the second phase of 
interphase, during which the DNA of the chromosomes is replicated. 
Finally, the G2 phase, also called the second gap phase, is the third and final 
phase of interphase; in this phase, the cell undergoes the final preparations 
for meiosis. 


During DNA duplication in the S phase, each chromosome is replicated to 
produce two identical copies, called sister chromatids, that are held together 
at the centromere by cohesin proteins. Cohesin holds the chromatids 
together until anaphase II. The centrosomes, which are the structures that 
organize the microtubules of the meiotic spindle, also replicate. This 
prepares the cell to enter prophase I, the first meiotic phase. 


Prophase I 


Early in prophase I, before the chromosomes can be seen clearly 
microscopically, the homologous chromosomes are attached at their tips to 
the nuclear envelope by proteins. As the nuclear envelope begins to break 
down, the proteins associated with homologous chromosomes bring the pair 
close to each other. Recall that, in mitosis, homologous chromosomes do 
not pair together. In mitosis, homologous chromosomes line up end-to-end 
so that when they divide, each daughter cell receives a sister chromatid 
from both members of the homologous pair. The synaptonemal complex, a 
lattice of proteins between the homologous chromosomes, first forms at 
specific locations and then spreads to cover the entire length of the 
chromosomes. The tight pairing of the homologous chromosomes is called 
synapsis. In synapsis, the genes on the chromatids of the homologous 
chromosomes are aligned precisely with each other. The synaptonemal 
complex supports the exchange of chromosomal segments between non- 
sister homologous chromatids, a process called crossing over. Crossing over 
can be observed visually after the exchange as chiasmata (singular = 
chiasma) ([link]). 


In species such as humans, even though the X and Y sex chromosomes are 
not homologous (most of their genes differ), they have a small region of 
homology that allows the X and Y chromosomes to pair up during prophase 
I. A partial synaptonemal complex develops only between the regions of 
homology. 


Figure labels: homologous chromosomes, centromere, kinetochore, 
synaptonemal complex, sister chromatids. 


Early in prophase I, homologous chromosomes come together to form a 
synapse. The chromosomes are bound tightly together and in perfect 
alignment by a protein lattice called a synaptonemal complex and by 
cohesin proteins at the centromere. 


Located at intervals along the synaptonemal complex are large protein 
assemblies called recombination nodules. These assemblies mark the 
points of later chiasmata and mediate the multistep process of crossover— 
or genetic recombination—between the non-sister chromatids. Near the 
recombination nodule on each chromatid, the double-stranded DNA is 
cleaved, the cut ends are modified, and a new connection is made between 
the non-sister chromatids. As prophase I progresses, the synaptonemal 
complex begins to break down and the chromosomes begin to condense. 
When the synaptonemal complex is gone, the homologous chromosomes 
remain attached to each other at the centromere and at chiasmata. The 
chiasmata remain until anaphase I. The number of chiasmata varies 
according to the species and the length of the chromosome. There must be 
at least one chiasma per chromosome for proper separation of homologous 
chromosomes during meiosis I, but there may be as many as 25. Following 
crossover, the synaptonemal complex breaks down and the cohesin 
connection between homologous pairs is also removed. At the end of 
prophase I, the pairs are held together only at the chiasmata ([link]) and are 
called tetrads because the four sister chromatids of each pair of 
homologous chromosomes are now visible. 


The crossover events are the first source of genetic variation in the nuclei 
produced by meiosis. A single crossover event between homologous non- 
sister chromatids leads to a reciprocal exchange of equivalent DNA 
between a maternal chromosome and a paternal chromosome. Now, when 
that sister chromatid is moved into a gamete cell it will carry some DNA 
from one parent of the individual and some DNA from the other parent. The 
sister recombinant chromatid has a combination of maternal and paternal 
genes that did not exist before the crossover. Multiple crossovers in an arm 
of the chromosome have the same effect, exchanging segments of DNA to 
create recombinant chromosomes. 
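
A single crossover can be pictured as a reciprocal swap of the segments that lie beyond the exchange point. The short Python sketch below illustrates that idea. The function name single_crossover, the use of letter strings for chromatids (uppercase for maternal alleles, lowercase for paternal), and the chosen crossover position are all assumptions made for this example; real crossover involves DNA cleavage and repair rather than string slicing.

```python
def single_crossover(chromatid_a, chromatid_b, position):
    """Reciprocally exchange the segments that follow `position`."""
    if len(chromatid_a) != len(chromatid_b):
        raise ValueError("homologous chromatids should carry the same loci")
    recombinant_a = chromatid_a[:position] + chromatid_b[position:]
    recombinant_b = chromatid_b[:position] + chromatid_a[position:]
    return recombinant_a, recombinant_b


maternal = "ABCDEF"  # one maternal chromatid, six loci
paternal = "abcdef"  # one paternal non-sister chromatid, same loci

rec_maternal, rec_paternal = single_crossover(maternal, paternal, 4)

# The tetrad after one crossover: two chromatids are unchanged and
# two are recombinant.
tetrad = [maternal, rec_maternal, rec_paternal, paternal]
print(tetrad)  # ['ABCDEF', 'ABCDef', 'abcdEF', 'abcdef']
```

Each recombinant chromatid carries the same loci in the same order as before, but with a combination of maternal and paternal alleles that did not exist on either original chromatid.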


Figure labels: homologous chromosomes aligned, chromatid crossover, 
recombinant chromatids, non-recombinant chromosomes. 


Crossover occurs between non-sister chromatids of homologous 
chromosomes. The result is an exchange of genetic material between 
homologous chromosomes. 


Prometaphase I 


The key event in prometaphase I is the attachment of the spindle fiber 
microtubules to the kinetochore proteins at the centromeres. Kinetochore 
proteins are multiprotein complexes that bind the centromeres of a 
chromosome to the microtubules of the meiotic spindle. Microtubules grow 
from centrosomes placed at opposite poles of the cell. The microtubules 
move toward the middle of the cell and attach to one of the two fused 
homologous chromosomes. The microtubules attach at each chromosome's 
kinetochores. With each member of the homologous pair attached to 
opposite poles of the cell, in the next phase, the microtubules can pull the 
homologous pair apart. A spindle fiber that has attached to a kinetochore is 
called a kinetochore microtubule. At the end of prometaphase I, each tetrad 
is attached to microtubules from both poles, with one homologous 
chromosome facing each pole. The homologous chromosomes are still held 
together at chiasmata. In addition, the nuclear membrane has broken down 
entirely. 


Metaphase I 


During metaphase I, the homologous chromosomes are arranged in the 
center of the cell with the kinetochores facing opposite poles. The 
homologous pairs orient themselves randomly at the equator. For example, 
if the two homologous members of chromosome 1 are labeled a and b, then 
the chromosomes could line up a-b, or b-a. This is important in determining 
the genes carried by a gamete, as each will only receive one of the two 
homologous chromosomes. Recall that homologous chromosomes are not 
identical. They contain slight differences in their genetic information, 
causing each gamete to have a unique genetic makeup. 


This randomness is the physical basis for the creation of the second form of 
genetic variation in offspring. Consider that the homologous chromosomes 
of a sexually reproducing organism are originally inherited as two separate 
sets, one from each parent. Using humans as an example, one set of 23 
chromosomes is present in the egg donated by the mother. The father 
provides the other set of 23 chromosomes in the sperm that fertilizes the 
egg. Every cell of the multicellular offspring has copies of the original two 
sets of homologous chromosomes. In prophase I of meiosis, the 
homologous chromosomes form the tetrads. In metaphase I, these pairs line 
up at the midway point between the two poles of the cell to form the 
metaphase plate. Because there is an equal chance that a microtubule fiber 
will encounter a maternally or paternally inherited chromosome, the 
arrangement of the tetrads at the metaphase plate is random. Any 
maternally inherited chromosome may face either pole. Any paternally 
inherited chromosome may also face either pole. The orientation of each 
tetrad is independent of the orientation of the other 22 tetrads. 


This event—the random (or independent) assortment of homologous 
chromosomes at the metaphase plate—is the second mechanism that 
introduces variation into the gametes or spores. In each cell that undergoes 
meiosis, the arrangement of the tetrads is different. The number of 
variations is dependent on the number of chromosomes making up a set. 
There are two possibilities for orientation at the metaphase plate; the 
possible number of alignments therefore equals 2^n, where n is the number 
of chromosomes per set. Humans have 23 chromosome pairs, which results 
in over eight million (2^23) possible genetically-distinct gametes. This 
number does not include the variability that was previously created in the 
sister chromatids by crossover. Given these two mechanisms, it is highly 
unlikely that any two haploid cells resulting from meiosis will have the 
same genetic composition ([link]). 
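
The count of possible chromosome combinations follows directly from two orientations per homologous pair. The Python sketch below works through the arithmetic; the function names possible_gametes and enumerate_gametes are invented for this illustration, and the letters M and P simply stand for the maternal and paternal member of each pair. Crossover is ignored, as it is in the figure quoted in the text.

```python
from itertools import product


def possible_gametes(n):
    """Number of maternal/paternal combinations for n chromosome pairs."""
    return 2 ** n


def enumerate_gametes(n):
    """List every combination; 'M' = maternal homolog, 'P' = paternal."""
    return ["".join(choice) for choice in product("MP", repeat=n)]


print(enumerate_gametes(2))  # ['MM', 'MP', 'PM', 'PP']
print(possible_gametes(2))   # 4
print(possible_gametes(23))  # 8388608
```

With n = 2 the four combinations can be listed by hand, while with n = 23 the same rule gives 8,388,608, the "over eight million" quoted for humans.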


To summarize the genetic consequences of meiosis I, the maternal and 
paternal genes are recombined by crossover events that occur between each 
homologous pair during prophase I. In addition, the random assortment of 
tetrads on the metaphase plate produces a unique combination of maternal 
and paternal chromosomes that will make their way into the gametes. 
Random, independent assortment during metaphase I can be 
demonstrated by considering a cell with a set of two 
chromosomes (n = 2). In this case, there are two possible 
arrangements at the equatorial plane in metaphase I. The total 
possible number of different gametes is 2^n, where n equals the 
number of chromosomes in a set. In this example, there are 
four possible genetic combinations for the gametes. With n = 
23 in human cells, there are over 8 million possible 
combinations of paternal and maternal chromosomes. 


Anaphase I 


In anaphase I, the microtubules pull the linked chromosomes apart. The 
sister chromatids remain tightly bound together at the centromere. The 
chiasmata are broken in anaphase I as the microtubules attached to the 
fused kinetochores pull the homologous chromosomes apart ([link]). 


Telophase I and Cytokinesis 


In telophase, the separated chromosomes arrive at opposite poles. The 
remainder of the typical telophase events may or may not occur, depending 
on the species. In some organisms, the chromosomes decondense and 
nuclear envelopes form around the chromatids in telophase I. In other 
organisms, cytokinesis—the physical separation of the cytoplasmic 
components into two daughter cells—occurs without reformation of the 
nuclei. In nearly all species of animals and some fungi, cytokinesis 
separates the cell contents via a cleavage furrow (constriction of the actin 
ring that leads to cytoplasmic division). In plants, a cell plate is formed 
during cytokinesis by Golgi vesicles fusing at the metaphase plate. This 
cell plate will ultimately lead to the formation of cell walls that separate the 
two daughter cells. 


Two haploid cells are the end result of the first meiotic division. The cells 
are haploid because at each pole, there is just one of each pair of the 
homologous chromosomes. Therefore, only one full set of the chromosomes 
is present. This is why the cells are considered haploid—there is only one 
chromosome set, even though each homolog still consists of two sister 
chromatids. Recall that sister chromatids are merely duplicates of one of the 
two homologous chromosomes (except for changes that occurred during 
crossing over). In meiosis II, these two sister chromatids will separate, 
creating four haploid daughter cells. 
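
The distinction between chromosome sets, chromosomes, and chromatids can be made concrete by tallying each quantity per cell at successive stages. The Python sketch below does this for a human cell (2n = 46). The function name meiosis_counts and the counting convention (an unreplicated chromosome counts as one chromatid, a replicated chromosome as two) are assumptions of this example, not definitions from the text.

```python
def meiosis_counts(diploid_number):
    """Return (stage, chromosome sets, chromosomes, chromatids) per cell.

    Counting convention (an assumption of this sketch): an unreplicated
    chromosome counts as one chromatid, a replicated one as two.
    """
    if diploid_number % 2:
        raise ValueError("a diploid chromosome number must be even")
    n = diploid_number // 2  # chromosomes in one set
    return [
        ("before S phase", 2, 2 * n, 2 * n),
        ("after S phase", 2, 2 * n, 4 * n),
        ("after meiosis I", 1, n, 2 * n),
        ("after meiosis II", 1, n, n),
    ]


for stage, sets, chromosomes, chromatids in meiosis_counts(46):
    print(f"{stage:17} sets={sets} chromosomes={chromosomes} chromatids={chromatids}")
```

The tally shows that the number of sets drops from two to one during meiosis I, while meiosis II only separates the sister chromatids of the single set that remains.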


Note: 
Link to Learning 
Review the process of meiosis, observing how chromosomes align and 
migrate, at Meiosis: An Interactive Animation. 


Meiosis II 


In some species, cells enter a brief interphase, or interkinesis, before 
entering meiosis II. Interkinesis lacks an S phase, so chromosomes are not 
duplicated. The two cells produced in meiosis I go through the events of 
meiosis II in synchrony. During meiosis II, the sister chromatids within the 
two daughter cells separate, forming four new haploid gametes. The 
mechanics of meiosis II are similar to mitosis, except that each dividing cell 
has only one set of homologous chromosomes. Therefore, each cell has half 
the number of sister chromatids to separate out as a diploid cell undergoing 
mitosis. 


Prophase II 


If the chromosomes decondensed in telophase I, they condense again. If 
nuclear envelopes were formed, they fragment into vesicles. The 
centrosomes that were duplicated during interkinesis move away from each 
other toward opposite poles, and new spindles are formed. 


Prometaphase II 


The nuclear envelopes are completely broken down, and the spindle is fully 
formed. Each sister chromatid forms an individual kinetochore that attaches 
to microtubules from opposite poles. 


Metaphase II 


The sister chromatids are maximally condensed and aligned at the equator 
of the cell. 


Anaphase II 


The sister chromatids are pulled apart by the kinetochore microtubules and 
move toward opposite poles. Non-kinetochore microtubules elongate the 
cell. 
The process of chromosome alignment differs between meiosis 
I and meiosis II. In prometaphase I, microtubules attach to the 
fused kinetochores of homologous chromosomes, and the 
homologous chromosomes are arranged at the midpoint of the 
cell in metaphase I. In anaphase I, the homologous 
chromosomes are separated. In prometaphase II, microtubules 
attach to the kinetochores of sister chromatids, and the sister 
chromatids are arranged at the midpoint of the cells in 
metaphase II. In anaphase II, the sister chromatids are 
separated. 


Telophase II and Cytokinesis 


The chromosomes arrive at opposite poles and begin to decondense. 
Nuclear envelopes form around the chromosomes. Cytokinesis separates the 
two cells into four unique haploid cells. At this point, the newly formed 
nuclei are all haploid. The cells produced are genetically unique because 
of the random assortment of paternal and maternal homologs and because 
of the recombining of maternal and paternal segments of chromosomes 
(with their sets of genes) that occurs during crossover. The entire process of 
meiosis is outlined in [link]. 
Chromosomes condense, and the nuclear envelope 
fragments. Homologous chromosomes bind firmly 
together along their length, forming a tetrad. 
Chiasmata form between non-sister chromatids. 
Crossing over occurs at the chiasmata. Spindle fibers 
emerge from the centrosomes. 


Homologous chromosomes are attached to spindle 
microtubules at the fused kinetochore shared by 
the sister chromatids. Chromosomes continue to 
condense, and the nuclear envelope completely 
disappears. 


Homologous chromosomes randomly assemble at the 
metaphase plate, where they have been maneuvered 
into place by the microtubules. 


Spindle microtubules pull the homologous 
chromosomes apart. The sister chromatids are still 
attached at the centromere. 


Sister chromatids arrive at the poles of the cell and 
begin to decondense. A nuclear envelope forms 
around each nucleus and the cytoplasm is divided by 
a cleavage furrow. The result is two haploid cells. 
Each cell contains one duplicated copy of each 
homologous chromosome pair. 


Sister chromatids condense. A new spindle begins to 
form. The nuclear envelope starts to fragment. 


The nuclear envelope disappears, and the spindle 
fibers engage the individual kinetochores on the 
sister chromatids. 


Sister chromatids line up at the metaphase plate. 


Sister chromatids are pulled apart by the shortening 
of the kinetochore microtubules. Non-kinetochore 
microtubules lengthen the cell. 


Chromosomes arrive at the poles of the cell and 
decondense. Nuclear envelopes surround the four 
nuclei. Cleavage furrows divide the two cells into 
four haploid cells. 


An animal cell with a diploid number of four (2n = 4) 
proceeds through the stages of meiosis to form four 
haploid daughter cells. 


Comparing Meiosis and Mitosis 


Mitosis and meiosis are both forms of division of the nucleus in eukaryotic 
cells. They share some similarities, but also exhibit distinct differences that 
lead to very different outcomes ([link]). Mitosis is a single nuclear division 
that results in two nuclei that are usually partitioned into two new cells. The 
nuclei resulting from a mitotic division are genetically identical to the 
original nucleus. They have the same number of sets of chromosomes, one 
set in the case of haploid cells and two sets in the case of diploid cells. In 
most plants and all animal species, it is typically diploid cells that undergo 
mitosis to form new diploid cells. In contrast, meiosis consists of two 
nuclear divisions resulting in four nuclei that are usually partitioned into 
four new cells. The nuclei resulting from meiosis are not genetically 
identical and they contain one chromosome set only. This is half the number 
of chromosome sets in the original cell, which is diploid. 


The main differences between mitosis and meiosis occur in meiosis I, which 
is a very different nuclear division than mitosis. In meiosis I, the 
homologous chromosome pairs become associated with each other, are 
bound together with the synaptonemal complex, develop chiasmata and 
undergo crossover between non-sister chromatids, and line up along the 
metaphase plate in tetrads with kinetochore fibers from opposite spindle 
poles attached to each kinetochore of a homolog in a tetrad. All of these 
events occur only in meiosis I. 


When the chiasmata resolve and the tetrad is broken up with the homologs 
moving to one pole or another, the ploidy level—the number of sets of 
chromosomes in each future nucleus—has been reduced from two to one. 
For this reason, meiosis I is referred to as a reduction division. There is no 
such reduction in ploidy level during mitosis. 
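
The bookkeeping behind the term "reduction division" can be sketched in a few lines of Python. This is only an illustrative model, not a description of the cellular machinery: the `Nucleus` class and the function names are hypothetical, and a nucleus is reduced to two facts, its number of chromosome sets and whether its chromosomes are duplicated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nucleus:
    sets: int         # ploidy level: number of chromosome sets
    duplicated: bool  # True if each chromosome has two sister chromatids


def replicate(nucleus):
    """S phase: every chromosome is copied; the ploidy level is unchanged."""
    return Nucleus(nucleus.sets, True)


def mitosis(nucleus):
    """One division that separates sister chromatids; ploidy is unchanged."""
    if not nucleus.duplicated:
        raise ValueError("chromosomes must be duplicated before division")
    return [Nucleus(nucleus.sets, False) for _ in range(2)]


def meiosis_I(nucleus):
    """Separates homologs: ploidy is halved, chromosomes stay duplicated."""
    if not nucleus.duplicated or nucleus.sets % 2:
        raise ValueError("needs duplicated chromosomes in paired sets")
    return [Nucleus(nucleus.sets // 2, True) for _ in range(2)]


def meiosis_II(nucleus):
    """Separates sister chromatids, as in mitosis; ploidy is unchanged."""
    if not nucleus.duplicated:
        raise ValueError("chromosomes must still be duplicated")
    return [Nucleus(nucleus.sets, False) for _ in range(2)]


def meiosis(nucleus):
    """Two successive divisions with no replication in between."""
    return [n for half in meiosis_I(nucleus) for n in meiosis_II(half)]


diploid = replicate(Nucleus(sets=2, duplicated=False))
print("mitosis:", mitosis(diploid))
print("meiosis:", meiosis(diploid))
```

In this model only `meiosis_I` changes `sets`, which is the sense in which meiosis I, and neither mitosis nor meiosis II, is a reduction division.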


Meiosis II is much more analogous to a mitotic division. In this case, the 
duplicated chromosomes (only one set of them) line up on the metaphase 
plate with divided kinetochores attached to kinetochore fibers from opposite 
poles. During anaphase II, as in mitotic anaphase, the kinetochores divide 
and one sister chromatid—now referred to as a chromosome—is pulled to 
one pole while the other sister chromatid is pulled to the other pole. If it 
were not for the fact that there had been crossover, the two products of each 
individual meiosis II division would be identical (like in mitosis). Instead, 
they are different because there has always been at least one crossover per 
chromosome. Meiosis II is not a reduction division because although there 
are fewer copies of the genome in the resulting cells, there is still one set of 
chromosomes, as there was at the end of meiosis I. 


Meiosis and mitosis are both preceded by one round of 
DNA replication; however, meiosis includes two nuclear 
divisions. The four daughter cells resulting from meiosis 
are haploid and genetically distinct. The daughter cells 
resulting from mitosis are diploid and identical to the 
parent cell. 


Note: 

Evolution Connection 

The Mystery of the Evolution of Meiosis 

Some characteristics of organisms are so widespread and fundamental that 
it is sometimes difficult to remember that they evolved like other simpler 
traits. Meiosis is such an extraordinarily complex series of cellular events 
that biologists have had trouble hypothesizing and testing how it may have 
evolved. Although meiosis is inextricably entwined with sexual 
reproduction and its advantages and disadvantages, it is important to 
separate the questions of the evolution of meiosis and the evolution of sex, 
because early meiosis may have been advantageous for different reasons 
than it is now. Thinking outside the box and imagining what the early 
benefits from meiosis might have been is one approach to uncovering how 
it may have evolved. 

Meiosis and mitosis share obvious cellular processes and it makes sense 
that meiosis evolved from mitosis. The difficulty lies in the clear 
differences between meiosis I and mitosis. Adam Wilkins and Robin 
Holliday[footnote] summarized the unique events that needed to occur for the 
evolution of meiosis from mitosis. These steps are homologous 
chromosome pairing, crossover exchanges, sister chromatids remaining 
attached during anaphase, and suppression of DNA replication in 
interphase. They argue that the first step is the hardest and most important, 
and that understanding how it evolved would make the evolutionary 
process clearer. They suggest genetic experiments that might shed light on 
the evolution of synapsis. 

Adam S. Wilkins and Robin Holliday, “The Evolution of Meiosis from 
Mitosis,” Genetics 181 (2009): 3-12. 

There are other approaches to understanding the evolution of meiosis in 
progress. Different forms of meiosis exist in single-celled protists. Some 
appear to be simpler or more “primitive” forms of meiosis. Comparing the 
meiotic divisions of different protists may shed light on the evolution of 
meiosis. Marilee Ramesh and colleagues[footnote] compared the genes 
involved in meiosis in protists to understand when and where meiosis 
might have evolved. Although research is still ongoing, recent scholarship 
into meiosis in protists suggests that some aspects of meiosis may have 
evolved later than others. This kind of genetic comparison can tell us what 
aspects of meiosis are the oldest and what cellular processes they may have 
borrowed from in earlier cells. 

Marilee A. Ramesh, Shehre-Banoo Malik and John M. Logsdon, Jr, “A 
Phylogenetic Inventory of Meiotic Genes: Evidence for Sex in Giardia and 
an Early Eukaryotic Origin of Meiosis,” Current Biology 15 (2005): 185-91. 


Note: 
Link to Learning 
Click through the steps of this interactive animation to compare the meiotic 
process of cell division to that of mitosis: How Cells Divide. 


Section Summary 


Sexual reproduction requires that diploid organisms produce haploid cells 
that can fuse during fertilization to form diploid offspring. As with mitosis, 
DNA replication occurs prior to meiosis during the S-phase of the cell 
cycle. Meiosis is a series of events that arrange and separate chromosomes 
and chromatids into daughter cells. During the interphase that precedes 
meiosis, each chromosome is duplicated. In meiosis, there are two rounds of nuclear 
division resulting in four nuclei and usually four daughter cells, each with 
half the number of chromosomes as the parent cell. The first separates 
homologs, and the second—like mitosis—separates chromatids into 
individual chromosomes. During meiosis, variation in the daughter nuclei is 
introduced because of crossover in prophase I and random alignment of 
tetrads at metaphase I. The cells that are produced by meiosis are 
genetically unique. 


Meiosis and mitosis share similarities, but have distinct outcomes. Mitotic 
divisions are single nuclear divisions that produce daughter nuclei that are 
genetically identical and have the same number of chromosome sets as the 
original cell. Meiotic divisions include two nuclear divisions that produce 
four daughter nuclei that are genetically different and have one 
chromosome set instead of the two sets of chromosomes in the parent cell. 
The main differences between the processes occur in the first division of 
meiosis, in which homologous chromosomes are paired and exchange 
non-sister chromatid segments. The homologous chromosomes separate into 
different nuclei during meiosis I, causing a reduction of ploidy level in the 
first division. The second division of meiosis is more similar to a mitotic 
division, except that the daughter cells do not contain identical genomes 
because of crossover. 


Review Questions 


Exercise: 


Problem: Meiosis produces ________ daughter cells. 


a. two haploid 
b. two diploid 
c. four haploid 
d. four diploid 


Solution: 


C 


Exercise: 


Problem: What structure is most important in forming the tetrads? 


a. centromere 

b. synaptonemal complex 
c. chiasma 

d. kinetochore 


Solution: 


B 
Exercise: 
Problem: 


At which stage of meiosis are sister chromatids separated from each 
other? 


a. prophase I 
b. prophase II 
c. anaphase I 
d. anaphase II 


Solution: 


D 
Exercise: 
Problem: 


At metaphase I, homologous chromosomes are connected only at what 
structures? 


a. chiasmata 

b. recombination nodules 
c. microtubules 

d. kinetochores 


Solution: 


A 


Exercise: 


Problem: Which of the following is not true in regard to crossover? 


a. Spindle microtubules guide the transfer of DNA across the 
synaptonemal complex. 

b. Non-sister chromatids exchange genetic material. 

c. Chiasmata are formed. 

d. Recombination nodules mark the crossover point. 


Solution: 


A 
Exercise: 


Problem: 


What phase of mitotic interphase is missing from meiotic interkinesis? 


a. G0 phase 
b. G1 phase 
c. S phase 
d. G2 phase 


Solution: 
C 
Exercise: 
Problem: The part of meiosis that is similar to mitosis is ________. 


a. meiosis I 
b. anaphase I 
c. meiosis II 
d. interkinesis 


Solution: 


C 
Exercise: 


Problem: 


If a muscle cell of a typical organism has 32 chromosomes, how many 
chromosomes will be in a gamete of that same organism? 
Solution: 


B 


Free Response 


Exercise: 


Problem: Describe the process that results in the formation of a tetrad. 


Solution: 


During the meiotic interphase, each chromosome is duplicated. The 
sister chromatids that are formed during synthesis are held together at 
the centromere region by cohesin proteins. All chromosomes are 
attached to the nuclear envelope by their tips. As the cell enters 
prophase I, the nuclear envelope begins to fragment, and the proteins 
associated with homologous chromosomes bring the pair close to each 
other. The four chromatids align lengthwise, and a protein lattice called the 
synaptonemal complex is formed between them to bind them together. 
The synaptonemal complex facilitates crossover between non-sister 
chromatids, which is observed as chiasmata along the length of the 
chromosome. As prophase I progresses, the synaptonemal complex 
breaks down and the homologous chromosomes become free, except where they 
are attached by chiasmata. At this stage, the four chromatids are visible 
in each homologous pairing and are called a tetrad. 


Exercise: 


Problem: 


Explain how the random alignment of homologous chromosomes 
during metaphase I contributes to the variation in gametes produced by 
meiosis. 


Solution: 


Random alignment leads to new combinations of traits. The 
chromosomes that were originally inherited by the gamete-producing 
individual came equally from the egg and the sperm. In metaphase I, 
the duplicated copies of these maternal and paternal homologous 
chromosomes line up across the center of the cell. The orientation of 
each tetrad is random. There is an equal chance that the maternally 
derived chromosomes will be facing either pole. The same is true of 
the paternally derived chromosomes. The alignment should occur 
differently in almost every meiosis. As the homologous chromosomes 
are pulled apart in anaphase I, any combination of maternal and 
paternal chromosomes will move toward each pole. The gametes 
formed from these two groups of chromosomes will have a mixture of 
traits from the individual’s parents. Each gamete is unique. 
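
The number of arrangements that random alignment alone can produce follows from a short count: each tetrad can face either pole independently, so a cell with n homologous pairs has 2^n possible orientations. The sketch below is a hedged illustration only; the function names are hypothetical, the letters M and P simply label the maternal or paternal homolog that moves toward one chosen pole, and crossover (which adds further variation) is ignored. The human haploid number of 23 is supplied as a well-known outside fact.

```python
from itertools import product


def alignment_combinations(haploid_number):
    """Count the maternal/paternal combinations possible at one pole."""
    return 2 ** haploid_number


def enumerate_pole_a(haploid_number):
    """List, pair by pair, which homolog (M or P) moves toward pole A."""
    return ["".join(choice) for choice in product("MP", repeat=haploid_number)]


# The animal cell in the figure (2n = 4) has two homologous pairs.
print(enumerate_pole_a(2))         # ['MM', 'MP', 'PM', 'PP']
print(alignment_combinations(2))   # 4

# Humans have 23 homologous pairs (n = 23).
print(alignment_combinations(23))  # 8388608
```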


Exercise: 
Problem: 


What is the function of the fused kinetochore found on sister 
chromatids in prometaphase I? 


Solution: 


In metaphase I, the homologous chromosomes line up at the metaphase 
plate. In anaphase I, the homologous chromosomes are pulled apart 
and move to opposite poles. Sister chromatids are not separated until 
meiosis II. The fused kinetochore formed during meiosis I ensures that 
each spindle microtubule that binds to the tetrad will attach to both 
sister chromatids. 


Exercise: 


Problem: 


In a comparison of the stages of meiosis to the stages of mitosis, which 
stages are unique to meiosis and which stages have the same events in 
both meiosis and mitosis? 


Solution: 


All of the stages of meiosis I, except possibly telophase I, are unique 
because homologous chromosomes are separated, not sister 
chromatids. In some species, the chromosomes do not decondense and 
the nuclear envelopes do not form in telophase I. All of the stages of 
meiosis II have the same events as the stages of mitosis, with the 
possible exception of prophase II. In some species, the chromosomes 
are still condensed and there is no nuclear envelope. Other than this, 
all processes are the same. 


Glossary 


chiasmata 
(singular, chiasma) the structure that forms at the crossover points 
after genetic material is exchanged 


cohesin 
proteins that form a complex that seals sister chromatids together at 
their centromeres until anaphase II of meiosis 


crossover 
exchange of genetic material between non-sister chromatids resulting 
in chromosomes that incorporate genes from both parents of the 
organism 


fertilization 
union of two haploid cells from two individual organisms 


interkinesis 
(also, interphase II) brief period of rest between meiosis I and meiosis 
II 


meiosis 
a nuclear division process that results in four haploid cells 


meiosis I 
first round of meiotic cell division; referred to as reduction division 
because the ploidy level is reduced from diploid to haploid 


meiosis II 
second round of meiotic cell division following meiosis I; sister 
chromatids are separated into individual chromosomes, and the result 
is four unique haploid cells 


recombination nodules 
protein assemblies formed on the synaptonemal complex that mark the 
points of crossover events and mediate the multistep process of genetic 
recombination between non-sister chromatids 


reduction division 
nuclear division that produces daughter nuclei each having one-half as 
many chromosome sets as the parental nucleus; meiosis I is a reduction 
division 


somatic cell 
all the cells of a multicellular organism except the gametes or 
reproductive cells 


spore 
haploid cell that can produce a haploid multicellular organism or can 
fuse with another spore to form a diploid cell 


synapsis 
formation of a close association between homologous chromosomes 
during prophase I 


synaptonemal complex 
protein lattice that forms between homologous chromosomes during 
prophase I, supporting crossover 


tetrad 
two duplicated homologous chromosomes (four chromatids) bound 
together by chiasmata during prophase I 


Sexual Reproduction 
By the end of this section, you will be able to: 


• Explain that meiosis and sexual reproduction are evolved traits 

• Identify variation among offspring as a potential evolutionary 
advantage to sexual reproduction 

• Describe the three different life-cycle types among sexual multicellular 
organisms and their commonalities 


Sexual reproduction was an early evolutionary innovation after the 
appearance of eukaryotic cells. It appears to have been very successful 
because most eukaryotes are able to reproduce sexually, and in many 
animals, it is the only mode of reproduction. And yet, scientists recognize 
some real disadvantages to sexual reproduction. On the surface, creating 
offspring that are genetic clones of the parent appears to be a better system. 
If the parent organism is successfully occupying a habitat, offspring with 
the same traits would be similarly successful. There is also the obvious 
benefit to an organism that can produce offspring whenever circumstances 
are favorable by asexual budding, fragmentation, or asexual eggs. These 
methods of reproduction do not require another organism of the opposite 
sex. Indeed, some organisms that lead a solitary lifestyle have retained the 
ability to reproduce asexually. In addition, in asexual populations, every 
individual is capable of reproduction. In sexual populations, the males are 
not producing the offspring themselves, so in theory an asexual population 
could grow twice as fast. 
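
The "twice as fast" argument can be made concrete with a toy calculation. The sketch below rests on assumptions that are not in the text and are chosen purely for illustration: generations do not overlap, every reproducing individual leaves the same number of offspring, and a sexual population is half male. The function name and parameters are hypothetical.

```python
def grow(initial, offspring_per_mother, fraction_reproducing, generations):
    """Return population sizes over time for non-overlapping generations."""
    sizes = [initial]
    for _ in range(generations):
        mothers = sizes[-1] * fraction_reproducing
        sizes.append(mothers * offspring_per_mother)
    return sizes


# Every individual in the asexual population produces offspring.
asexual = grow(100, offspring_per_mother=2, fraction_reproducing=1.0, generations=5)

# Only the females (assumed to be half the population) produce offspring.
sexual = grow(100, offspring_per_mother=2, fraction_reproducing=0.5, generations=5)

print("asexual:", asexual)
print("sexual: ", sexual)
print("growth-rate ratio:", (asexual[1] / asexual[0]) / (sexual[1] / sexual[0]))
```

Under these assumptions the asexual population's per-generation growth factor is exactly twice that of the sexual population, whatever value is chosen for `offspring_per_mother`.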


However, multicellular organisms that exclusively depend on asexual 
reproduction are exceedingly rare. Why is sexuality (and meiosis) so 
common? This is one of the important unanswered questions in biology and 
has been the focus of much research beginning in the latter half of the 
twentieth century. There are several possible explanations, one of which is 
that the variation that sexual reproduction creates among offspring is very 
important to the survival and reproduction of the population. Thus, on 
average, a sexually reproducing population will leave more descendants 
than an otherwise similar asexually reproducing population. The only 
source of variation in asexual organisms is mutation. This is the ultimate 
source of variation in sexual organisms, but in addition, those different 
mutations are continually reshuffled from one generation to the next when 
different parents combine their unique genomes and the genes are mixed 
into different combinations by crossovers during prophase I and random 
assortment at metaphase I. 


Note: 

Evolution Connection 

The Red Queen Hypothesis 

It is not in dispute that sexual reproduction provides evolutionary 
advantages to organisms that employ this mechanism to produce offspring. 
But why, even in the face of fairly stable conditions, does sexual 
reproduction persist when it is more difficult and costly for individual 
organisms? Variation is the outcome of sexual reproduction, but why are 
ongoing variations necessary? Enter the Red Queen hypothesis, first 
proposed by Leigh Van Valen in 1973.[footnote] The concept was named in 
reference to the Red Queen's race in Lewis Carroll's book, Through the 
Looking-Glass. 

Leigh Van Valen, “A New Evolutionary Law,” Evolutionary Theory 1 
(1973): 1-30. 

All species co-evolve with other organisms; for example, predators evolve 
with their prey, and parasites evolve with their hosts. Each tiny advantage 
gained by favorable variation gives a species an edge over close 
competitors, predators, parasites, or even prey. The only method that will 
allow a co-evolving species to maintain its own share of the resources is to 
also continually improve its fitness. As one species gains an advantage, 
this increases selection on the other species; they must also develop an 
advantage or they will be outcompeted. No single species progresses too 
far ahead because genetic variation among the progeny of sexual 
reproduction provides all species with a mechanism to improve rapidly. 
Species that cannot keep up become extinct. The Red Queen’s catchphrase 
was, “It takes all the running you can do to stay in the same place.” This is 
an apt description of co-evolution between competing species. 


Life Cycles of Sexually Reproducing Organisms 


Fertilization and meiosis alternate in sexual life cycles. What happens 
between these two events depends on the organism. The process of meiosis 
reduces the chromosome number by half. Fertilization, the joining of two 
haploid gametes, restores the diploid condition. There are three main 
categories of life cycles in multicellular organisms: diploid-dominant, in 
which the multicellular diploid stage is the most obvious life stage, such as 
with most animals including humans; haploid-dominant, in which the 
multicellular haploid stage is the most obvious life stage, such as with all 
fungi and some algae; and alternation of generations, in which the two 
stages are apparent to different degrees depending on the group, as with 
plants and some algae. 


Diploid-Dominant Life Cycle 


Nearly all animals employ a diploid-dominant life-cycle strategy in which 
the only haploid cells produced by the organism are the gametes. Early in 
the development of the embryo, specialized diploid cells, called germ cells, 
are produced within the gonads, such as the testes and ovaries. Germ cells 
are capable of mitosis to perpetuate the cell line and meiosis to produce 
gametes. Once the haploid gametes are formed, they lose the ability to 
divide again. There is no multicellular haploid life stage. Fertilization 
occurs with the fusion of two gametes, usually from different individuals, 
restoring the diploid state ([link]). 


In animals, sexually reproducing adults form haploid 
gametes from diploid germ cells. Fusion of the 
gametes gives rise to a fertilized egg cell, or zygote. 
The zygote will undergo multiple rounds of mitosis to 
produce a multicellular offspring. The germ cells are 
generated early in the development of the zygote. 


Haploid-Dominant Life Cycle 


Most fungi and algae employ a life-cycle type in which the “body” of the 
organism—the ecologically important part of the life cycle—is haploid. The 
haploid cells that make up the tissues of the dominant multicellular stage 
are formed by mitosis. During sexual reproduction, specialized haploid cells 
from two individuals, designated the (+) and (-) mating types, join to form 
a diploid zygote. The zygote immediately undergoes meiosis to form four 
haploid cells called spores. Although haploid like the “parents,” these 
spores contain a new genetic combination from two parents. The spores can 
remain dormant for various time periods. Eventually, when conditions are 
conducive, the spores form multicellular haploid structures by many rounds 
of mitosis ([link]). 


Note: 
Art Connection 
Fungi, such as black bread mold (Rhizopus 
nigricans), have haploid-dominant life cycles. The 
haploid multicellular stage produces specialized 
haploid cells by mitosis that fuse to form a diploid 
zygote. The zygote undergoes meiosis to produce 
haploid spores. Each spore gives rise to a 
multicellular haploid organism by mitosis. (credit 
“zygomycota” micrograph: modification of work 
by “Fanaberka”/Wikimedia Commons) 


If a mutation occurs so that a fungus is no longer able to produce a minus 
mating type, will it still be able to reproduce? 


Alternation of Generations 


The third life-cycle type, employed by some algae and all plants, is a blend 
of the haploid-dominant and diploid-dominant extremes. Species with 
alternation of generations have both haploid and diploid multicellular 
organisms as part of their life cycle. The haploid multicellular plants are 
called gametophytes, because they produce gametes from specialized cells. 
Meiosis is not directly involved in the production of gametes in this case, 
because the organism that produces the gametes is already a haploid. 
Fertilization between the gametes forms a diploid zygote. The zygote will 
undergo many rounds of mitosis and give rise to a diploid multicellular 
plant called a sporophyte. Specialized cells of the sporophyte will undergo 
meiosis and produce haploid spores. The spores will subsequently develop 
into the gametophytes ([link]). 


Plants have a life cycle that alternates between a 
multicellular haploid organism and a multicellular 
diploid organism. In some plants, such as ferns, both 
the haploid and diploid plant stages are free-living. 
The diploid plant is called a sporophyte because it 
produces haploid spores by meiosis. The spores 
develop into multicellular, haploid plants called 
gametophytes because they produce gametes. The 
gametes of two individuals will fuse to form a diploid 
zygote that becomes the sporophyte. (credit “fern”: 
modification of work by Cory Zanker; credit 
“sporangia”: modification of work by “Obsidian 
Soul”/Wikimedia Commons; credit “gametophyte and 
sporophyte”: modification of work by 
“Vlmastra”/Wikimedia Commons) 


Although all plants utilize some version of the alternation of generations, 
the relative size of the sporophyte and the gametophyte and the relationship 
between them vary greatly. In plants such as moss, the gametophyte 
organism is the free-living plant, and the sporophyte is physically 
dependent on the gametophyte. In other plants, such as ferns, both the 
gametophyte and sporophyte plants are free-living; however, the sporophyte 
is much larger. In seed plants, such as magnolia trees and daisies, the 
gametophyte is composed of only a few cells and, in the case of the female 
gametophyte, is completely retained within the sporophyte. 


Sexual reproduction takes many forms in multicellular organisms. 
However, at some point in each type of life cycle, meiosis produces haploid 
cells that will fuse with the haploid cell of another organism. The 
mechanisms of variation—crossover, random assortment of homologous 
chromosomes, and random fertilization—are present in all versions of 
sexual reproduction. The fact that nearly every multicellular organism on 
Earth employs sexual reproduction is strong evidence for the benefits of 
producing offspring with unique gene combinations, though there are other 
possible benefits as well. 
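
The three life-cycle types described above differ only in where mitosis, meiosis, and fertilization fall relative to one another. As a rough sketch (the table of events and the function names are hypothetical simplifications, with each cycle started at the diploid zygote), the ploidy of every stage can be traced like this:

```python
# How each event changes the ploidy of the cells involved.
TRANSITIONS = {
    ("diploid", "mitosis"): "diploid",
    ("haploid", "mitosis"): "haploid",
    ("diploid", "meiosis"): "haploid",
    ("haploid", "fertilization"): "diploid",
}

# Simplified order of events, starting from the zygote in each case.
LIFE_CYCLES = {
    # zygote -> multicellular diploid body -> gametes -> zygote
    "diploid-dominant": ["mitosis", "meiosis", "fertilization"],
    # zygote -> spores -> multicellular haploid body -> zygote
    "haploid-dominant": ["meiosis", "mitosis", "fertilization"],
    # zygote -> sporophyte -> spores -> gametophyte -> zygote
    "alternation of generations": ["mitosis", "meiosis", "mitosis", "fertilization"],
}


def follow(events, start="diploid"):
    """Return the ploidy after each event, beginning with the zygote."""
    history = [start]
    for event in events:
        history.append(TRANSITIONS[(history[-1], event)])
    return history


def multicellular_ploidies(events, start="diploid"):
    """Ploidy levels at which mitosis builds a multicellular stage."""
    history = follow(events, start)
    return {p for p, event in zip(history, events) if event == "mitosis"}


for name, events in LIFE_CYCLES.items():
    print(name, follow(events), sorted(multicellular_ploidies(events)))
```

Every cycle returns to a diploid zygote; what differs is whether the multicellular stage is diploid, haploid, or both.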


Section Summary 


Nearly all eukaryotes undergo sexual reproduction. The variation 
introduced into the reproductive cells by meiosis appears to be one of the 
advantages of sexual reproduction that has made it so successful. Meiosis 
and fertilization alternate in sexual life cycles. The process of meiosis 
produces unique reproductive cells called gametes, which have half the 
number of chromosomes as the parent cell. Fertilization, the fusion of 
haploid gametes from two individuals, restores the diploid condition. Thus, 
sexually reproducing organisms alternate between haploid and diploid 
stages. However, the ways in which reproductive cells are produced and the 
timing between meiosis and fertilization vary greatly. There are three main 
categories of life cycles: diploid-dominant, demonstrated by most animals; 
haploid-dominant, demonstrated by all fungi and some algae; and the 
alternation of generations, demonstrated by plants and some algae. 


Art Connections 


Exercise: 


Problem: 


[link] If a mutation occurs so that a fungus is no longer able to produce 
a minus mating type, will it still be able to reproduce? 


Solution: 


[link] Yes, it will be able to reproduce asexually. 


Review Questions 


Exercise: 


Problem: 


What is a likely evolutionary advantage of sexual reproduction over 
asexual reproduction? 


a. Sexual reproduction involves fewer steps. 

b. There is a lower chance of using up the resources in a given 
environment. 

c. Sexual reproduction results in variation in the offspring. 

d. Sexual reproduction is more cost-effective. 


Solution: 


C 
Exercise: 


Problem: 


Which type of life cycle has both a haploid and diploid multicellular 
stage? 


a. asexual 

b. diploid-dominant 

c. haploid-dominant 

d. alternation of generations 


Solution: 


D 


Exercise: 


Problem: Fungi typically display which type of life cycle? 


a. diploid-dominant 

b. haploid-dominant 

c. alternation of generations 
d. asexual 


Solution: 


B 


Exercise: 


Problem: 


A diploid, multicellular life-cycle stage that gives rise to haploid cells 
by meiosis is called a 


a. sporophyte 
b. gametophyte 
c. spore 

d. gamete 


Solution: 


A 


Free Response 


Exercise: 


Problem: 


List and briefly describe the three processes that lead to variation in 
offspring with the same parents. 


Solution: 


a. Crossover occurs in prophase I between non-sister chromatids of homologous 
chromosomes. Segments of DNA are exchanged between maternally 
derived and paternally derived chromosomes, and new gene 
combinations are formed. b. Random alignment during metaphase I 
leads to gametes that have a mixture of maternal and paternal 
chromosomes. c. Fertilization is random, in that any two gametes can 
fuse. 


Exercise: 


Problem: 


Compare the three main types of life cycles in multicellular organisms 
and give an example of an organism that employs each. 


Solution: 


a. In the haploid-dominant life cycle, the multicellular stage is haploid. 
The diploid stage is a single-celled zygote that undergoes meiosis to produce 
haploid spores, which divide mitotically to produce new multicellular organisms. 
Fungi have a haploid-dominant life cycle. b. In the diploid-dominant 
life cycle, the most visible or largest multicellular stage is diploid. The 
haploid stage is usually reduced to a single cell type, such as a gamete 
or spore. Animals, such as humans, have a diploid-dominant life cycle. 
c. In the alternation of generations life cycle, there are both haploid 
and diploid multicellular stages, although the haploid stage may be 
completely retained by the diploid stage. Plants have a life cycle with 
alternation of generations. 


Glossary 


alternation of generations 
life-cycle type in which the diploid and haploid stages alternate 


diploid-dominant 
life-cycle type in which the multicellular diploid stage is prevalent 


haploid-dominant 
life-cycle type in which the multicellular haploid stage is prevalent 


gametophyte 
a multicellular haploid life-cycle stage that produces gametes 


germ cells 
specialized cell line that produces gametes, such as eggs or sperm 


life cycle 
the sequence of events in the development of an organism and the 
production of cells that produce offspring 


sporophyte 
a multicellular diploid life-cycle stage that produces haploid spores by 
meiosis 


Introduction 


Chromosomes are threadlike nuclear structures consisting of DNA and 
proteins that serve as the repositories for genetic information. The 
chromosomes depicted here were isolated from a fruit fly’s salivary gland, 
stained with dye, and visualized under a microscope. Akin to miniature bar 
codes, chromosomes absorb different dyes to produce characteristic banding 
patterns, which allows for their routine identification. (credit: 
modification of work by “LPLT”/Wikimedia Commons; scale-bar data from Matt 
Russell) 


The gene is the physical unit of inheritance, and genes are arranged in a 
linear order on chromosomes. The behaviors and interactions of 
chromosomes during meiosis explain, at a cellular level, the patterns of 
inheritance that we observe in populations. Genetic disorders involving 
alterations in chromosome number or structure may have dramatic effects 
and can prevent a fertilized egg from developing altogether. 


Chromosomal Theory and Genetic Linkage 
By the end of this section, you will be able to: 


• Discuss Sutton’s Chromosomal Theory of Inheritance 

• Describe genetic linkage 

• Explain the process of homologous recombination, or crossing over 

• Describe how chromosome maps are created 

• Calculate the distances between three genes on a chromosome using a 
three-point test cross 


Long before chromosomes were visualized under a microscope, the father 
of modern genetics, Gregor Mendel, began studying heredity in 1843. With 
the improvement of microscopic techniques during the late 1800s, cell 
biologists could stain and visualize subcellular structures with dyes and 
observe their actions during cell division and meiosis. With each mitotic 
division, chromosomes replicated, condensed from an amorphous (no 
constant shape) nuclear mass into distinct X-shaped bodies (pairs of 
identical sister chromatids), and migrated to separate cellular poles. 


Chromosomal Theory of Inheritance 


The speculation that chromosomes might be the key to understanding 
heredity led several scientists to examine Mendel’s publications and 
re-evaluate his model in terms of the behavior of chromosomes during mitosis 
and meiosis. In 1902, Theodor Boveri observed that proper embryonic 
development of sea urchins does not occur unless chromosomes are present. 
That same year, Walter Sutton observed the separation of chromosomes into 
daughter cells during meiosis ([link]). Together, these observations led to 
the development of the Chromosomal Theory of Inheritance, which 
identified chromosomes as the genetic material responsible for Mendelian 
inheritance. 


(a) (b) 


(a) Walter Sutton and (b) 
Theodor Boveri are credited 
with developing the 
Chromosomal Theory of 
Inheritance, which states that 
chromosomes carry the unit 
of heredity (genes). 


The Chromosomal Theory of Inheritance was consistent with Mendel’s 
laws and was supported by the following observations: 


• During meiosis, homologous chromosome pairs migrate as discrete 
structures that are independent of other chromosome pairs. 

• The sorting of chromosomes from each homologous pair into 
pre-gametes appears to be random. 

• Each parent synthesizes gametes that contain only half of their 
chromosomal complement. 

• Even though male and female gametes (sperm and egg) differ in size 
and morphology, they have the same number of chromosomes, 
suggesting equal genetic contributions from each parent. 

• The gametic chromosomes combine during fertilization to produce 
offspring with the same chromosome number as their parents. 
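
Several of the observations listed above amount to simple chromosome bookkeeping, which the following sketch tries to make explicit. It is a toy model under stated assumptions: the chromosome labels (such as "m1a") and the function names are invented for illustration, each parent is represented as a list of homologous pairs, and crossover is ignored.

```python
import random


def make_gamete(parent, rng):
    """Pick one chromosome at random from each homologous pair."""
    return [rng.choice(pair) for pair in parent]


def fertilize(egg, sperm):
    """Combine two gametes; each new pair has one homolog per parent."""
    return list(zip(egg, sperm))


def chromosome_count(individual):
    """Total number of chromosomes in a diploid individual."""
    return sum(len(pair) for pair in individual)


rng = random.Random(0)  # seeded only so that a run is repeatable

# Two parents with a diploid number of four (2n = 4).
mother = [("m1a", "m1b"), ("m2a", "m2b")]
father = [("f1a", "f1b"), ("f2a", "f2b")]

egg = make_gamete(mother, rng)
sperm = make_gamete(father, rng)
child = fertilize(egg, sperm)

print("egg:", egg, "sperm:", sperm)
print("child:", child)
print("chromosomes in child:", chromosome_count(child))
```

Each gamete carries half of its parent's chromosomes, both gametes carry the same number, and the offspring ends up with the parental chromosome number, which mirrors the third, fourth, and fifth observations in the list.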


Despite compelling correlations between the behavior of chromosomes 
during meiosis and Mendel’s abstract laws, the Chromosomal Theory of 
Inheritance was proposed long before there was any direct evidence that 
traits were carried on chromosomes. Critics pointed out that individuals had 
far more independently segregating traits than they had chromosomes. It 
was only after several years of carrying out crosses with the fruit fly, 
Drosophila melanogaster, that Thomas Hunt Morgan provided 
experimental evidence to support the Chromosomal Theory of Inheritance. 


Genetic Linkage and Distances 


Mendel’s work suggested that traits are inherited independently of each 
other. Morgan identified a 1:1 correspondence between a segregating trait 
and the X chromosome, suggesting that the random segregation of 
chromosomes was the physical basis of Mendel’s model. This also 
demonstrated that linked genes disrupt Mendel’s predicted outcomes. The 
fact that each chromosome can carry many linked genes explains how 
individuals can have many more traits than they have chromosomes. 
However, observations by researchers in Morgan’s laboratory suggested 
that alleles positioned on the same chromosome were not always inherited 
together. During meiosis, linked genes somehow became unlinked. 


Homologous Recombination 


In 1909, Frans Janssen observed chiasmata—the point at which chromatids 
are in contact with each other and may exchange segments—prior to the 
first division of meiosis. He suggested that alleles become unlinked and 
chromosomes physically exchange segments. As chromosomes condensed 
and paired with their homologs, they appeared to interact at distinct points. 
Janssen suggested that these points corresponded to regions in which 
chromosome segments were exchanged. It is now known that the pairing 
and interaction between homologous chromosomes, known as synapsis, 
does more than simply organize the homologs for migration to separate 
daughter cells. When synapsed, homologous chromosomes undergo 
reciprocal physical exchanges at their arms in a process called homologous 
recombination, or more simply, “crossing over.” 


To better understand the type of experimental results that researchers were 
obtaining at this time, consider a heterozygous individual that inherited 
dominant maternal alleles for two genes on the same chromosome (such as 
AB) and two recessive paternal alleles for those same genes (such as ab). If 
the genes are linked, one would expect this individual to produce gametes 
that are either AB or ab with a 1:1 ratio. If the genes are unlinked, the 
individual should produce AB, Ab, aB, and ab gametes with equal 
frequencies, according to the Mendelian concept of independent assortment. 
Because they correspond to new allele combinations, the genotypes Ab and 
aB are nonparental types that result from homologous recombination 
during meiosis. Parental types are progeny that exhibit the same allelic 
combination as their parents. Morgan and his colleagues, however, found 
that when such heterozygous individuals were test crossed to a homozygous 
recessive parent (AaBb × aabb), both parental and nonparental cases
occurred. For example, 950 offspring might be recovered that were either 
AaBb or aabb, but 50 offspring would also be obtained that were either 
Aabb or aaBb. These results suggested that linkage occurred most often, but 
a significant minority of offspring were the products of recombination. 


Note: 
Art Connection 
Inheritance patterns of unlinked and
linked genes are shown. In (a), two 
genes are located on different 
chromosomes so independent assortment 
occurs during meiosis. The offspring 
have an equal chance of being the 
parental type (inheriting the same 
combination of traits as the parents) or a 
nonparental type (inheriting a different 
combination of traits than the parents). 
In (b), two genes are very close together 
on the same chromosome so that no 
crossing over occurs between them. The 
genes are therefore always inherited 
together and all of the offspring are the
parental type. In (c), two genes are far
apart on the chromosome such that 
crossing over occurs during every 
meiotic event. The recombination 
frequency will be the same as if the 
genes were on separate chromosomes. 
(d) The actual recombination frequency 
of fruit fly wing length and body color 
that Thomas Morgan observed in 1912 
was 17 percent. A crossover frequency 
between 0 percent and 50 percent 
indicates that the genes are on the same 
chromosome and crossover occurs some 
of the time. 


In a test cross for two characteristics such as the one shown here, can the 
predicted frequency of recombinant offspring be 60 percent? Why or why 
not? 


Genetic Maps 


Janssen did not have the technology to demonstrate crossing over so it 
remained an abstract idea that was not widely accepted. Scientists thought 
chiasmata were a variation on synapsis and could not understand how 
chromosomes could break and rejoin. Yet, the data were clear that linkage 
did not always occur. Ultimately, it took a young undergraduate student and 
an “all-nighter” to mathematically elucidate the problem of linkage and 
recombination. 


In 1913, Alfred Sturtevant, a student in Morgan’s laboratory, gathered 
results from researchers in the laboratory, and took them home one night to 
mull them over. By the next morning, he had created the first “chromosome 
map,” a linear representation of gene order and relative distance on a 
chromosome ([link]).


Note: 
Art Connection 


Genetic Map Based on Recombination
Frequencies in Drosophila


[Figure: linear map of a Drosophila chromosome showing, in order, the loci
for aristae length (short/long), body color (black/gray), eye color
(cinnabar/red), wing length (vestigial/normal), and eye color (brown/red).]


Values in centimorgan (cM) map units; recombination
frequency of 0.01 = 1 cM


This genetic map 
orders Drosophila 
genes on the basis 
of recombination 
frequency. 


Which of the following statements is true? 


a. Recombination of the body color and red/cinnabar eye alleles will 
occur more frequently than recombination of the alleles for wing 
length and aristae length. 

b. Recombination of the body color and aristae length alleles will occur 
more frequently than recombination of red/brown eye alleles and the 
aristae length alleles. 

c. Recombination of the gray/black body color and long/short aristae 
alleles will not occur. 

d. Recombination of the red/brown eye and long/short aristae alleles will 
occur more frequently than recombination of the alleles for wing 
length and body color. 


As shown in [link], by using recombination frequency to predict genetic 
distance, the relative order of genes on chromosome 2 could be inferred. 
The values shown represent map distances in centimorgans (cM), which 
correspond to recombination frequencies (in percent). Therefore, the genes 
for body color and wing size were 65.5 − 48.5 = 17 cM apart, indicating
that the maternal and paternal alleles for these genes recombine in 17 
percent of offspring, on average. 
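
The subtraction above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `map_positions_cm` and `map_distance_cm` are hypothetical, and the only data used are the two map positions quoted in the text (48.5 and 65.5 cM).

```python
# Map positions (in cM) quoted in the text for two loci on the Drosophila map.
# The dictionary and function names are illustrative assumptions.
map_positions_cm = {"body color": 48.5, "wing size": 65.5}


def map_distance_cm(gene_a, gene_b, positions=map_positions_cm):
    """Return the map distance between two loci as the absolute
    difference of their positions on the same chromosome map."""
    return abs(positions[gene_a] - positions[gene_b])


distance = map_distance_cm("body color", "wing size")
# 1 cM corresponds to a recombination frequency of 0.01 (1 percent).
recombination_frequency = distance / 100
print(f"{distance} cM apart; recombination frequency {recombination_frequency}")
```

Taking the absolute difference means the order in which the two genes are named does not matter.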


To construct a chromosome map, Sturtevant assumed that genes were 
ordered serially on threadlike chromosomes. He also assumed that
recombination between two homologous chromosomes could occur with equal
likelihood anywhere along the length of the chromosome.
Operating under these assumptions, Sturtevant postulated that alleles that 
were far apart on a chromosome were more likely to dissociate during 
meiosis simply because there was a larger region over which recombination 
could occur. Conversely, alleles that were close to each other on the 
chromosome were likely to be inherited together. The average number of 
crossovers between two alleles—that is, their recombination frequency— 
correlated with their genetic distance from each other, relative to the 
locations of other genes on that chromosome. Considering the example 
cross between AaBb and aabb above, the frequency of recombination could 
be calculated as 50/1000 = 0.05. That is, the likelihood of a crossover 
between genes A/a and B/b was 0.05, or 5 percent. Such a result would 
indicate that the genes were definitively linked, but that they were far 
enough apart for crossovers to occasionally occur. Sturtevant divided his 
genetic map into map units, or centimorgans (cM), in which a 
recombination frequency of 0.01 corresponds to 1 cM. 
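
The calculation in the paragraph above can be written as a short Python sketch. The function names are hypothetical, and the counts (950 parental and 50 recombinant offspring) are the illustrative numbers from the example test cross, not experimental data.

```python
def recombination_frequency(parental_count, recombinant_count):
    """Fraction of test-cross offspring that are nonparental (recombinant)."""
    total = parental_count + recombinant_count
    if total <= 0:
        raise ValueError("at least one offspring must be counted")
    return recombinant_count / total


def to_centimorgans(frequency):
    """Convert a recombination frequency to map units (0.01 = 1 cM)."""
    return frequency * 100


# Example cross from the text: AaBb x aabb with 950 parental-type
# and 50 recombinant-type offspring.
rf = recombination_frequency(parental_count=950, recombinant_count=50)
print(f"recombination frequency = {rf:.2f}, map distance = {to_centimorgans(rf):.1f} cM")
```

A frequency near 0 indicates tight linkage, while a frequency near 0.5 cannot be distinguished from independent assortment.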


By representing alleles in a linear map, Sturtevant suggested that genes can
range from being perfectly linked (recombination frequency = 0) to being
perfectly unlinked (recombination frequency = 0.5), as occurs when genes are
on different chromosomes or are very far apart on the same chromosome.
Perfectly unlinked genes assort independently, producing the frequencies
that Mendel predicted for a dihybrid cross. A recombination frequency of 0.5
indicates that 50 percent of offspring are recombinants and the other 50
percent are parental types. That is, every type of allele combination is
represented with equal frequency. This representation allowed Sturtevant to
additively calculate distances between several genes on the same chromosome.
However, as the genetic distances approached 0.50, his predictions became
less accurate because it was not clear whether the genes were very far apart
on the same chromosome or on different chromosomes.


In 1931, Barbara McClintock and Harriet Creighton demonstrated the 
crossover of homologous chromosomes in corn plants. Weeks later, 
homologous recombination in Drosophila was demonstrated 
microscopically by Curt Stern. Stern observed several X-linked phenotypes 
that were associated with a structurally unusual and dissimilar X 
chromosome pair in which one X was missing a small terminal segment, 
and the other X was fused to a piece of the Y chromosome. By crossing 
flies, observing their offspring, and then visualizing the offspring’s 
chromosomes, Stern demonstrated that every time the offspring allele 
combination deviated from either of the parental combinations, there was a 
corresponding exchange of an X chromosome segment. Using mutant flies 
with structurally distinct X chromosomes was the key to observing the 
products of recombination because DNA sequencing and other molecular 
tools were not yet available. It is now known that homologous 
chromosomes regularly exchange segments in meiosis by reciprocally 
breaking and rejoining their DNA at precise locations. 


Note: 
Link to Learning 
Review Sturtevant’s process to create a genetic map on the basis of
recombination frequencies here. 


Mendel’s Mapped Traits 


Homologous recombination is a common genetic process, yet Mendel never 
observed it. Had he investigated both linked and unlinked genes, it would 
have been much more difficult for him to create a unified model of his data 
on the basis of probabilistic calculations. Researchers who have since 
mapped the seven traits investigated by Mendel onto the seven 
chromosomes of the pea plant genome have confirmed that all of the genes 
he examined are either on separate chromosomes or are sufficiently far 
apart as to be statistically unlinked. Some have suggested that Mendel was 
enormously lucky to select only unlinked genes, whereas others question 
whether Mendel discarded any data suggesting linkage. In any case, Mendel 
consistently observed independent assortment because he examined genes 
that were effectively unlinked. 


Section Summary 


The Chromosomal Theory of Inheritance, proposed by Sutton and Boveri,
states that chromosomes are the vehicles of genetic heredity. Neither 
Mendelian genetics nor gene linkage is perfectly accurate; instead, 
chromosome behavior involves segregation, independent assortment, and 
occasionally, linkage. Sturtevant devised a method to assess recombination 
frequency and infer the relative positions and distances of linked genes on a 
chromosome on the basis of the average number of crossovers in the 
intervening region between the genes. Sturtevant correctly presumed that 
genes are arranged in serial order on chromosomes and that recombination
between homologs can occur anywhere on a chromosome with equal 
likelihood. Whereas linkage causes alleles on the same chromosome to be 
inherited together, homologous recombination biases alleles toward an 
inheritance pattern of independent assortment. 


Art Connections 


Exercise: 


Problem: 


[link] In a test cross for two characteristics such as the one shown here, 
can the predicted frequency of recombinant offspring be 60 percent? 
Why or why not? 


Solution: 


[link] No. The predicted frequency of recombinant offspring ranges 
from 0% (for linked traits) to 50% (for unlinked traits). 


Exercise: 


Problem: [link] Which of the following statements is true? 


a. Recombination of the body color and red/cinnabar eye alleles will 
occur more frequently than recombination of the alleles for wing 
length and aristae length. 

b. Recombination of the body color and aristae length alleles will 
occur more frequently than recombination of red/brown eye 
alleles and the aristae length alleles. 

c. Recombination of the gray/black body color and long/short 
aristae alleles will not occur. 

d. Recombination of the red/brown eye and long/short aristae alleles 
will occur more frequently than recombination of the alleles for 
wing length and body color. 


Solution: 


[link] D 


Review Questions 


Exercise: 


Problem: 


X-linked recessive traits in humans (or in Drosophila) are observed 


a. in more males than females 

b. in more females than males 

c. in males and females equally 

d. in different distributions depending on the trait 


Solution: 


A 
Exercise: 


Problem: 


The first suggestion that chromosomes may physically exchange 
segments came from the microscopic identification of 


a. synapsis 

b. sister chromatids 
c. chiasmata 

d. alleles 


Solution: 


C 
Exercise: 


Problem: 


Which recombination frequency corresponds to independent 
assortment and the absence of linkage? 


a. 0
b. 0.25
c. 0.50
d. 0.75


Solution: 


C
Exercise: 
Problem: 
Which recombination frequency corresponds to perfect linkage and 
violates the law of independent assortment? 
a. 0 
b. 0.25 
c. 0.50
d. 0.75 
Solution: 


A 


Free Response 


Exercise: 


Problem: 


Explain how the Chromosomal Theory of Inheritance helped to 
advance our understanding of genetics. 


Solution: 


The Chromosomal Theory of Inheritance proposed that genes reside on 
chromosomes. The understanding that chromosomes are linear arrays
of genes explained linkage, and crossing over explained
recombination. 


Glossary 


centimorgan (cM) 
(also, map unit) relative distance that corresponds to a recombination 
frequency of 0.01 


Chromosomal Theory of Inheritance 
theory proposing that chromosomes are the vehicles of genes and that 
their behavior during meiosis is the physical basis of the inheritance 
patterns that Mendel observed 


homologous recombination 
process by which homologous chromosomes undergo reciprocal 
physical exchanges at their arms, also known as crossing over 


nonparental (recombinant) type 
progeny resulting from homologous recombination that exhibits a 
different allele combination compared with its parents 


parental types 
progeny that exhibits the same allelic combination as its parents 


recombination frequency 
average number of crossovers between two alleles; observed as the 
number of nonparental types in a population of progeny 


Null Hypothesis 
This experimental only 


The Chi-Squared Test


This material was copied from Nature’s Scitable. Source:
http://www.nature.com/scitable/buildbook/preview/open-genetics-129407306/129407486#headerAndCitation


Forming and Testing a Hypothesis 


The first thing any scientist does before performing an
experiment is to form a hypothesis about the experiment's outcome. This
often takes the form of a null hypothesis, which is a statistical hypothesis
that provides the expected values for an experiment. The null hypothesis is
proposed by a scientist before completing an experiment, and it can be
supported by data or rejected in favor of an alternative hypothesis.


Let's consider some examples of the use of the null hypothesis in a genetics 
experiment. Remember that Mendelian inheritance deals with traits that 
show discontinuous variation, which means that the phenotypes fall into 
distinct categories. As a consequence, in a Mendelian genetic cross, the null 
hypothesis is usually an extrinsic hypothesis; in other words, the expected 
proportions can be predicted and calculated before the experiment starts. 
Then an experiment can be designed to determine whether the data confirm 
or reject the hypothesis. On the other hand, in another experiment, you 
might hypothesize that two genes are linked. This is called an intrinsic 
hypothesis, which is a hypothesis in which the expected proportions are 
calculated after the experiment is done using some information from the 
experimental data (McDonald, 2008). 


How Math Merged with Biology 


But how did mathematics and genetics come to be linked through the use of 
hypotheses and statistical analysis? The key figure in this process was Karl 
Pearson, a turn-of-the-century mathematician who was fascinated with 
biology. When asked what his first memory was, Pearson responded by 
saying, "Well, I do not know how old I was, but I was sitting in a high chair 
and I was sucking my thumb. Someone told me to stop sucking it and said 
that if I did so, the thumb would wither away. I put my two thumbs together 
and looked at them a long time. 'They look alike to me,' I said to myself, 'I
can't see that the thumb I suck is any smaller than the other. I wonder if she
could be lying to me'" (Walker, 1958). As this anecdote illustrates, Pearson
was perhaps born to be a scientist. He was a sharp observer and intent on 
interpreting his own data. During his career, Pearson developed statistical 
theories and applied them to the exploration of biological data. His 
innovations were not well received, however, and he faced an arduous 
struggle in convincing other scientists to accept the idea that mathematics 
should be applied to biology. For instance, during Pearson's time, the Royal 
Society, which is the United Kingdom's academy of science, would accept 
papers that concerned either mathematics or biology, but it refused to 
accept papers that concerned both subjects (Walker, 1958). In response,
Pearson, along with Francis Galton and W. F. R. Weldon, founded a new 
journal called Biometrika in 1901 to promote the statistical analysis of data 
on heredity. Pearson's persistence paid off. Today, statistical tests are 
essential for examining biological data. 


Pearson's Chi-Square Test for Goodness-of-Fit 


One of Pearson's most significant achievements occurred in 1900, when he
developed a statistical test called Pearson's chi-square ($\chi^2$) test, also
known as the chi-square test for goodness-of-fit (Pearson, 1900). Pearson's
chi-square test is used to examine the role of chance in producing deviations
between observed and expected values. The test depends on an extrinsic 
hypothesis, because it requires theoretical expected values to be calculated. 
The test indicates the probability that chance alone produced the deviation 
between the expected and the observed values (Pierce, 2005). When the
probability calculated from Pearson's chi-square test is high, it is assumed
that chance alone produced the difference. Conversely, when the probability 
is low, it is assumed that a significant factor other than chance produced the 
deviation. 


In 1912, J. Arthur Harris applied Pearson's chi-square test to examine 
Mendelian ratios (Harris, 1912). It is important to note that when Gregor 
Mendel studied inheritance, he did not use statistics, and neither did
Bateson, Saunders, Punnett, and Morgan during their experiments that
discovered genetic linkage. Thus, until Pearson's statistical tests were
applied to biological data, scientists judged the goodness of fit between
theoretical and observed experimental results simply by inspecting the data
and drawing conclusions (Harris, 1912). Although this method can work
perfectly if one's data exactly match one's predictions, scientific
experiments often have variability associated with them, and this makes
statistical tests very useful.


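The chi-square value is calculated using the following formula:


$\chi^2 = \sum \frac{(O - E)^2}{E}$


where O is the observed frequency and E is the expected frequency for each
outcome category.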


Using this formula, the difference between the observed and expected 
frequencies is calculated for each experimental outcome category. The 
difference is then squared and divided by the expected frequency. Finally,
the chi-square values for each outcome are summed together, as represented
by the summation sign ($\Sigma$).


Pearson's chi-square test works well with genetic data as long as the
expected count in each category is large enough. In the case of small samples
(fewer than 10 in any category) that have 1 degree of freedom, the test is not
reliable. (Degrees of freedom, or df, will be explained in full later in this
article.) However, in such cases, the test can be corrected by using the Yates
correction for continuity, which reduces the absolute value of each
difference between observed and expected frequencies by 0.5 before
squaring. Additionally, it is important to remember that the chi-square test
can only be applied to numbers of progeny, not to proportions or
percentages.
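
The effect of the Yates correction can be illustrated with a minimal Python sketch. The function name `chi_square` and the small counts used below are hypothetical, chosen only to show the arithmetic. Flooring the corrected difference at zero is an added assumption, made so that the correction never increases a deviation smaller than 0.5.

```python
def chi_square(observed, expected, yates=False):
    """Sum of (O - E)^2 / E over outcome categories.

    With yates=True, each absolute difference |O - E| is reduced by 0.5
    before squaring. Flooring at zero is an assumption of this sketch.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            diff = max(diff - 0.5, 0.0)
        total += diff ** 2 / e
    return total


# Hypothetical small sample: 10 offspring, 3:1 ratio expected (7.5 : 2.5).
observed_counts = [9, 1]
expected_counts = [7.5, 2.5]
uncorrected = chi_square(observed_counts, expected_counts)
corrected = chi_square(observed_counts, expected_counts, yates=True)
print(f"uncorrected = {uncorrected:.3f}, Yates-corrected = {corrected:.3f}")
```

The corrected value is always less than or equal to the uncorrected one, which makes the test more conservative for small samples.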


Now that you know the rules for using the test, it's time to consider an
example of how to calculate Pearson's chi-square. Recall that when Mendel
crossed his pea plants, he learned that tall (T) was dominant to short (t). You
want to confirm that this is correct, so you start by formulating the
following null hypothesis: In a cross between two heterozygous (Tt) plants,
the offspring should occur in a 3:1 ratio of tall plants to short plants. Next,
you cross the plants, and after the cross, you measure the
characteristics of 400 offspring. You note that there are 305 tall pea plants
and 95 short pea plants; these are your observed values. Meanwhile,
you expect that there will be 300 tall plants and 100 short plants from the
Mendelian ratio.


You are now ready to perform statistical analysis of your results, but first,
you have to choose a critical value at which to reject your null hypothesis.
You opt for a critical value probability of 0.01 (1%) that the deviation
between the observed and expected values is due to chance. This means that
if the probability is less than 0.01, then the deviation is significant and not
due to chance, and you will reject your null hypothesis. However, if the
probability is greater than 0.01, then the deviation is not significant and you
will not reject the null hypothesis.


So, should you reject your null hypothesis or not? Here's a summary of your 
observed and expected data: 


            Tall    Short
Expected    300     100
Observed    305     95


Now, let's calculate Pearson's chi-square: 


For tall plants: $\chi^2 = (305 - 300)^2 / 300 = 0.08$


For short plants: $\chi^2 = (95 - 100)^2 / 100 = 0.25$
The sum of the two categories is 0.08 + 0.25 = 0.33
Therefore, the overall Pearson's chi-square for the experiment is $\chi^2 = 0.33$
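
The same arithmetic can be reproduced with a short Python sketch. The variable names are illustrative assumptions; the counts and the 3:1 ratio come from the pea plant example above.

```python
# Observed offspring counts and the Mendelian ratio from the example cross.
observed = {"tall": 305, "short": 95}
ratio = {"tall": 3, "short": 1}

total_offspring = sum(observed.values())
ratio_total = sum(ratio.values())

# Expected counts under the null hypothesis of a 3:1 ratio.
expected = {k: total_offspring * ratio[k] / ratio_total for k in observed}

# Contribution of each category: (O - E)^2 / E.
contributions = {
    k: (observed[k] - expected[k]) ** 2 / expected[k] for k in observed
}
chi2 = sum(contributions.values())

for category, value in contributions.items():
    print(f"{category}: expected {expected[category]:.0f}, contribution {value:.2f}")
print(f"chi-square = {chi2:.2f}")
```

Computing the expected counts from the ratio, rather than typing them in, keeps the test tied to counts of progeny instead of proportions.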


Next, you determine the probability that is associated with your calculated 
chi-square value. To do this, you compare your calculated chi-square value 
with theoretical values in a chi-square table that has the same number of 
degrees of freedom. Degrees of freedom represent the number of ways in 
which the observed outcome categories are free to vary. For Pearson's
chi-square test, the degrees of freedom are equal to n - 1, where n represents
the number of different expected phenotypes (Pierce, 2005). In your
experiment, there are two expected outcome phenotypes (tall and short),
so n = 2 categories, and the degrees of freedom equal 2 - 1 = 1. Thus, with
your calculated chi-square value (0.33) and the associated degrees of 
freedom (1), you can determine the probability by using a chi-square table 
(Table 1). 


(Table adapted from Jones, 2008) 


Note that the chi-square table is organized with degrees of freedom (df) in 
the left column and probabilities (P) at the top. The chi-square values 
associated with the probabilities are in the center of the table. To determine 
the probability, first locate the row for the degrees of freedom for your 
experiment, then determine where the calculated chi-square value would be 
placed among the theoretical values in the corresponding row. 


At the beginning of your experiment, you decided that if the probability was 
less than 0.01, you would reject your null hypothesis because the deviation 
would be significant
and not due to chance. Now, looking at the row that corresponds to 1 degree
of freedom, you see that your calculated chi-square value of 0.33 falls 
between 0.016, which is associated with a probability of 0.9, and 2.706, 
which is associated with a probability of 0.10. Therefore, there is between a 
10% and 90% probability that the deviation you observed between your 
expected and the observed numbers of tall and short plants is due to chance. 


In other words, the probability associated with your chi-square value is
much greater than the critical value of 0.01. This means that we will not
reject our null hypothesis, and the deviation between the observed and
expected results is not significant.
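
Instead of bracketing the probability with a table, it can be computed directly when there is 1 degree of freedom. The sketch below is an illustration, not part of the original worked example. It relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so the tail probability equals erfc(sqrt(x / 2)). The function name `p_value_df1` is hypothetical, and the formula does not apply to other degrees of freedom.

```python
import math


def p_value_df1(chi2_value):
    """Upper-tail probability for a chi-square statistic with 1 degree
    of freedom: P = erfc(sqrt(chi2 / 2)). Valid only for df = 1."""
    if chi2_value < 0:
        raise ValueError("chi-square values cannot be negative")
    return math.erfc(math.sqrt(chi2_value / 2))


chi2_value = 25 / 300 + 25 / 100   # tall and short contributions from the example
p = p_value_df1(chi2_value)
critical_probability = 0.01

print(f"chi-square = {chi2_value:.2f}, P = {p:.2f}")
if p < critical_probability:
    print("Reject the null hypothesis.")
else:
    print("Do not reject the null hypothesis.")
```

The computed probability falls between the two table entries quoted above (0.10 and 0.9), in agreement with the table lookup.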


Level of Significance 


The experimenter decides whether to reject a hypothesis by choosing the
"level of significance" or confidence. Scientists commonly use the 0.05, 0.01,
or 0.001 probability levels as cut-off values. For instance, in the example
experiment, you used the 0.01 probability. There, P > 0.01, which can be
interpreted to mean that chance likely caused the deviation between the
observed and the expected values (i.e., there is a greater than 1% probability
that chance explains the data). If instead we had observed that P < 0.01, this
would mean that there is less than a 1% probability that our data can be
explained by chance. In that case, there would be a significant difference
between our expected and observed results, so the deviation would be
attributed to something other than chance.


Chromosomal Basis of Inherited Disorders 
By the end of this section, you will be able to: 


• Describe how a karyogram is created

• Explain how nondisjunction leads to disorders in chromosome number

• Compare disorders caused by aneuploidy

• Describe how errors in chromosome structure occur through inversions
and translocations


Inherited disorders can arise when chromosomes behave abnormally during 
meiosis. Chromosome disorders can be divided into two categories: 
abnormalities in chromosome number and chromosomal structural 
rearrangements. Because even small segments of chromosomes can span 
many genes, chromosomal disorders are characteristically dramatic and 
often fatal. 


Identification of Chromosomes 


The isolation and microscopic observation of chromosomes form the basis
of cytogenetics and constitute the primary method by which clinicians detect
chromosomal abnormalities in humans. A karyotype is the number and 
appearance of chromosomes, and includes their length, banding pattern, and 
centromere position. To obtain a view of an individual’s karyotype, 
cytologists photograph the chromosomes and then cut and paste each 
chromosome into a chart, or karyogram, also known as an ideogram 
([link]).


This karyotype is of a female human. Notice that homologous 
chromosomes are the same size, and have the same centromere 
positions and banding patterns. A human male would have an XY 
chromosome pair instead of the XX pair shown. (credit: Andreas 
Bolzer et al.)


In a given species, chromosomes can be identified by their number, size,
centromere position, and banding pattern. In a human karyotype,
autosomes or “body chromosomes” (all of the non-sex chromosomes) are
generally organized in approximate order of size from largest (chromosome
1) to smallest (chromosome 22). The X and Y chromosomes are not
autosomes. However, chromosome 21 is actually shorter than chromosome
22. This was discovered after the naming of Down syndrome as trisomy 21,
reflecting how this disease results from possessing one extra chromosome
21 (three total). Because researchers did not want to change the name of
this important disease, chromosome 21 retained its numbering, despite being
the shortest of the autosomes. The chromosome “arms” projecting from either end of the
centromere may be designated as short or long, depending on their relative 
lengths. The short arm is abbreviated p (for “petite”), whereas the long arm 
is abbreviated q (because it follows “p” alphabetically). Each arm is further 
subdivided and denoted by a number. Using this naming system, locations 
on chromosomes can be described consistently in the scientific literature. 


Note: 

Career Connection 

Geneticists Use Karyograms to Identify Chromosomal Aberrations 
Although Mendel is referred to as the “father of modern genetics,” he 
performed his experiments with none of the tools that the geneticists of 
today routinely employ. One such powerful cytological technique is 
karyotyping, a method in which traits characterized by chromosomal 
abnormalities can be identified from a single cell. To observe an 
individual’s karyotype, a person’s cells (like white blood cells) are first 
collected from a blood sample or other tissue. In the laboratory, the isolated
cells are stimulated to begin actively dividing. A chemical called
colchicine is then applied to cells to arrest condensed chromosomes in 
metaphase. Cells are then made to swell using a hypotonic solution so the 
chromosomes spread apart. Finally, the sample is preserved in a fixative 
and applied to a slide. 

The geneticist then stains chromosomes with one of several dyes to better 
visualize the distinct and reproducible banding patterns of each 
chromosome pair. Following staining, the chromosomes are viewed using 
bright-field microscopy. A common stain choice is the Giemsa stain. 
Giemsa staining results in approximately 400-800 bands (of tightly coiled 
DNA and condensed proteins) arranged along all of the 23 chromosome 
pairs; an experienced geneticist can identify each band. In addition to the 
banding patterns, chromosomes are further identified on the basis of size 
and centromere location. To obtain the classic depiction of the karyotype in 
which homologous pairs of chromosomes are aligned in numerical order 
from longest to shortest, the geneticist obtains a digital image, identifies 
each chromosome, and manually arranges the chromosomes into this 
pattern ([link]).

At its most basic, the karyogram may reveal genetic abnormalities in which 
an individual has too many or too few chromosomes per cell. Examples of 
this are Down Syndrome, which is identified by a third copy of 
chromosome 21, and Turner Syndrome, which is characterized by the 
presence of only one X chromosome in women instead of the normal two. 
Geneticists can also identify large deletions or insertions of DNA. For 
instance, Jacobsen Syndrome—which involves distinctive facial features as 
well as heart and bleeding defects—is identified by a deletion on 
chromosome 11. Finally, the karyotype can pinpoint translocations, which 
occur when a segment of genetic material breaks from one chromosome 
and reattaches to another chromosome or to a different part of the same 
chromosome. Translocations are implicated in certain cancers, including 
chronic myelogenous leukemia. 

During Mendel’s lifetime, inheritance was an abstract concept that could 
only be inferred by performing crosses and observing the traits expressed 
by offspring. By observing a karyogram, today’s geneticists can actually 
visualize the chromosomal composition of an individual to confirm or 
predict genetic abnormalities in offspring, even before birth. 


Disorders in Chromosome Number 


Of all of the chromosomal disorders, abnormalities in chromosome number 
are the most obviously identifiable from a karyogram. Disorders of 
chromosome number include the duplication or loss of entire chromosomes, 
as well as changes in the number of complete sets of chromosomes. They 
are caused by nondisjunction, which occurs when pairs of homologous 
chromosomes or sister chromatids fail to separate during meiosis. 
Misaligned or incomplete synapsis, or a dysfunction of the spindle 
apparatus that facilitates chromosome migration, can cause nondisjunction. 
The risk of nondisjunction occurring increases with the age of the parents. 


Nondisjunction can occur during either meiosis I or II, with differing results 
([link]). If homologous chromosomes fail to separate during meiosis I, the 
result is two gametes that lack that particular chromosome and two gametes 
with two copies of the chromosome. If sister chromatids fail to separate 
during meiosis II, the result is one gamete that lacks that chromosome, two 
normal gametes with one copy of the chromosome, and one gamete with 
two copies of the chromosome. 
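
The gamete counts described above can be checked with a small bookkeeping sketch. It follows a single chromosome pair through one meiosis and reports how many copies each of the four gametes receives. The function name `meiosis` and the list-based representation of chromatids are illustrative assumptions, not part of the text, and the sketch assumes that a meiosis II error affects only one of the two dividing cells.

```python
def meiosis(nondisjunction=None):
    """Return the copy number of one chromosome in each of the four gametes.

    nondisjunction: None (normal), "I" (homologs fail to separate), or
    "II" (sister chromatids fail to separate in one meiosis II cell).
    """
    # After replication, each homolog consists of two sister chromatids.
    homologs = [["maternal", "maternal"], ["paternal", "paternal"]]

    # Meiosis I: homologs normally move to different cells.
    if nondisjunction == "I":
        cells = [homologs, []]
    else:
        cells = [[homologs[0]], [homologs[1]]]

    # Meiosis II: sister chromatids normally move to different gametes.
    gametes = []
    for index, cell in enumerate(cells):
        first, second = [], []
        for sisters in cell:
            if nondisjunction == "II" and index == 0:
                first.extend(sisters)  # both sisters end up in one gamete
            else:
                first.append(sisters[0])
                second.append(sisters[1])
        gametes.extend([first, second])
    return [len(gamete) for gamete in gametes]


for stage in (None, "I", "II"):
    print(stage, meiosis(stage))
```

A count of 1 marks a normal gamete, 2 marks a gamete with an extra copy (n+1), and 0 marks a gamete that lacks the chromosome (n-1).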


Note: 
Art Connection 


Nondisjunction 


[Figure panels: nondisjunction during meiosis I; nondisjunction during 
meiosis II] 


Nondisjunction occurs when 
homologous chromosomes or sister 
chromatids fail to separate during 
meiosis, resulting in an abnormal 
chromosome number. Nondisjunction 
may occur during meiosis I or meiosis 
II. 


Which of the following statements about nondisjunction is true? 


a. Nondisjunction only results in gametes with n+1 or n-1 
chromosomes. 

b. Nondisjunction occurring during meiosis II results in 50 percent 
normal gametes. 

c. Nondisjunction during meiosis I results in 50 percent normal gametes. 

d. Nondisjunction always results in four different kinds of gametes. 


Aneuploidy 


An individual with the appropriate number of chromosomes for their 
species is called euploid; in humans, euploidy corresponds to 22 pairs of 
autosomes and one pair of sex chromosomes. An individual with an error in 
chromosome number is described as aneuploid, a term that includes 
monosomy (loss of one chromosome) or trisomy (gain of an extraneous 
chromosome). Monosomic human zygotes missing any one copy of an 
autosome invariably fail to develop to birth because they lack essential 
genes. This underscores the importance of “gene dosage” in humans. Most 
autosomal trisomies also fail to develop to birth; however, duplications of 
some of the smaller chromosomes (13, 15, 18, 21, or 22) can result in 
offspring that survive for several weeks to many years. Trisomic individuals 
suffer from a different type of genetic imbalance: an excess in gene dose. 
Individuals with an extra chromosome may synthesize an abundance of the 
gene products encoded by that chromosome. This extra dose (150 percent) 
of specific genes can lead to a number of functional challenges and often 
precludes development. The most common trisomy among viable births is 
that of chromosome 21, which corresponds to Down Syndrome. Individuals 
with this inherited disorder are characterized by short stature and stunted 
digits, facial distinctions that include a broad skull and large tongue, and 
significant developmental delays. The incidence of Down syndrome is 
correlated with maternal age; older women are more likely to become 
pregnant with fetuses carrying the trisomy 21 genotype ([link]). 
Data source: American Family Physician; Aug 15, 2000 


The incidence of having a fetus with 
trisomy 21 increases dramatically with 
maternal age. 


Note: 
Link to Learning 




Visualize the addition of a chromosome that leads to Down syndrome in 
this video simulation. 


Polyploidy 


An individual with more than the correct number of chromosome sets (two 
for diploid species) is called polyploid. For instance, fertilization of an 
abnormal diploid egg with a normal haploid sperm would yield a triploid 
zygote. Polyploid animals are extremely rare, with only a few examples 
among the flatworms, crustaceans, amphibians, fish, and lizards. Polyploid 
animals are sterile because meiosis cannot proceed normally and instead 
produces mostly aneuploid daughter cells that cannot yield viable zygotes. 
Rarely, polyploid animals can reproduce asexually by parthenogenesis, in 
which an unfertilized egg divides mitotically to produce offspring. In 
contrast, polyploidy is very common in the plant kingdom, and polyploid 
plants tend to be larger and more robust than euploids of their species 
([link]). 


As with many polyploid plants, 
this triploid orange daylily 
(Hemerocallis fulva) is 
particularly large and robust, and 
grows flowers with triple the 
number of petals of its diploid 
counterparts. (credit: Steve Karg) 


Sex Chromosome Nondisjunction in Humans 


Humans display dramatic deleterious effects with autosomal trisomies and 
monosomies. Therefore, it may seem counterintuitive that human females 
and males can function normally, despite carrying different numbers of the 
X chromosome. Rather than a gain or loss of autosomes, variations in the 
number of sex chromosomes are associated with relatively mild effects. In 
part, this occurs because of a molecular process called X inactivation. 
Early in development, when female mammalian embryos consist of just a 
few thousand cells (relative to trillions in the newborn), one X chromosome 
in each cell inactivates by tightly condensing into a quiescent (dormant) 
structure called a Barr body. The chance that an X chromosome (maternally 
or paternally derived) is inactivated in each cell is random, but once the 
inactivation occurs, all cells derived from that one will have the same 
inactive X chromosome or Barr body. By this process, females compensate 
for their double genetic dose of X chromosome. In so-called “tortoiseshell” 
cats, embryonic X inactivation is observed as color variegation ([link]). 
Females that are heterozygous for an X-linked coat color gene will express 
one of two different coat colors over different regions of their body, 
corresponding to whichever X chromosome is inactivated in the embryonic 
cell progenitor of that region. 


In cats, the gene for 
coat color is located 
on the X chromosome. 
In the embryonic 
development of female 
cats, one of the two X 
chromosomes is 
randomly inactivated 
in each cell, resulting 
in a tortoiseshell 
pattern if the cat has 
two different alleles 
for coat color. Male 
cats, having only one 
X chromosome, never 
exhibit a tortoiseshell 
coat color. (credit: 
Michael Bodega) 


An individual carrying an abnormal number of X chromosomes will 
inactivate all but one X chromosome in each of her cells. However, even 
inactivated X chromosomes continue to express a few genes, and X 
chromosomes must reactivate for the proper maturation of female ovaries. 
As a result, X-chromosomal abnormalities are typically associated with 
mild mental and physical defects, as well as sterility. If the X chromosome 
is absent altogether, the individual will not develop in utero. 


Several errors in sex chromosome number have been characterized. 
Individuals with three X chromosomes, called triplo-X, are phenotypically 
female but express developmental delays and reduced fertility. The XXY 
genotype, corresponding to one type of Klinefelter syndrome, corresponds 
to phenotypically male individuals with small testes, enlarged breasts, and 
reduced body hair. More complex types of Klinefelter syndrome exist in 
which the individual has as many as five X chromosomes. In all types, 
every X chromosome except one undergoes inactivation to compensate for 
the excess genetic dosage. This can be seen as several Barr bodies in each 
cell nucleus. Turner syndrome, characterized as an XO genotype (i.e., only a 
single sex chromosome), corresponds to a phenotypically female individual 
with short stature, webbed skin in the neck region, hearing and cardiac 
impairments, and sterility. 
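
The rule that every X chromosome except one is inactivated can be written as a one-line calculation. In this sketch the function name `barr_bodies` and the use of plain strings such as "XXY" for the sex chromosome complement are assumptions made for illustration. Turner syndrome is written "XO", as in the text, and the letter O contributes no X.

```python
def barr_bodies(sex_chromosomes):
    """Expected number of Barr bodies per nucleus: all X chromosomes but one."""
    x_count = sex_chromosomes.upper().count("X")
    return max(x_count - 1, 0)


# Genotypes mentioned in this section, plus the typical XX and XY cases.
for genotype in ("XX", "XY", "XXX", "XXY", "XXXXXY", "XO"):
    print(genotype, barr_bodies(genotype))
```

The sketch counts Barr bodies only. It does not model the genes that continue to be expressed from inactivated X chromosomes.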


Duplications and Deletions 


In addition to the loss or gain of an entire chromosome, a chromosomal 
segment may be duplicated or lost. Duplications and deletions often 
produce offspring that survive but exhibit physical and mental 
abnormalities. Duplicated chromosomal segments may fuse to existing 
chromosomes or may be free in the nucleus. Cri-du-chat (from the French 
for “cry of the cat”) is a syndrome associated with nervous system 
abnormalities and identifiable physical features that result from a deletion 
of most of 5p (the small arm of chromosome 5) ([link]). Infants with this 
genotype emit a characteristic high-pitched cry on which the disorder’s 
name is based. 


This individual with cri-du-chat 
syndrome is shown at two, four, 
nine, and 12 years of age. (credit: 
Paola Cerruti Mainardi) 


Chromosomal Structural Rearrangements 


Cytologists have characterized numerous structural rearrangements in 
chromosomes, but chromosome inversions and translocations are the most 
common. Both are identified during meiosis by the adaptive pairing of 
rearranged chromosomes with their former homologs to maintain 
appropriate gene alignment. If the genes carried on two homologs are not 
oriented correctly, a recombination event could result in the loss of genes 
from one chromosome and the gain of genes on the other. This would 
produce aneuploid gametes. 


Chromosome Inversions 


A chromosome inversion is the detachment, 180° rotation, and reinsertion 
of part of a chromosome. Inversions may occur in nature as a result of 
mechanical shear, or from the action of transposable elements (special DNA 
sequences capable of facilitating the rearrangement of chromosome 
segments with the help of enzymes that cut and paste DNA sequences). 
Unless they disrupt a gene sequence, inversions only change the orientation 
of genes and are likely to have more mild effects than aneuploid errors. 
However, altered gene orientation can result in functional changes because 
regulators of gene expression could be moved out of position with respect 
to their targets, causing aberrant levels of gene products. 


An inversion can be pericentric and include the centromere, or 
paracentric and occur outside of the centromere ([link]). A pericentric 
inversion that is asymmetric about the centromere can change the relative 
lengths of the chromosome arms, making these inversions easily 
identifiable. 
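
The distinction between the two inversion types, and the effect of each on arm length, can be illustrated with a toy model. Here a chromosome is a string of single-letter genes with the character "o" standing for the centromere. This representation, the example chromosome, and the function names are all hypothetical choices made for the sketch.

```python
CENTROMERE = "o"


def invert(chromosome, start, end):
    """Detach chromosome[start:end], rotate it 180 degrees, and reinsert it."""
    return chromosome[:start] + chromosome[start:end][::-1] + chromosome[end:]


def inversion_type(chromosome, start, end):
    """Pericentric inversions include the centromere; paracentric ones do not."""
    if CENTROMERE in chromosome[start:end]:
        return "pericentric"
    return "paracentric"


def arm_lengths(chromosome):
    """Number of genes on each side of the centromere."""
    position = chromosome.index(CENTROMERE)
    return position, len(chromosome) - position - 1


normal = "ABoCDEFG"  # hypothetical chromosome with arms of 2 and 5 genes
for start, end in ((1, 5), (1, 4), (4, 7)):
    rearranged = invert(normal, start, end)
    print(inversion_type(normal, start, end), rearranged, arm_lengths(rearranged))
```

In this model the pericentric inversion of positions 1 to 5 is asymmetric about the centromere and shifts the arm lengths, the pericentric inversion of positions 1 to 4 is symmetric and leaves them unchanged, and the paracentric inversion never changes them.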


Pericentric and Paracentric Inversions 

[Figure labels: normal chromosome; pericentric inversion; paracentric 
inversion; centromere] 


Pericentric inversions include the 
centromere, and paracentric 
inversions do not. A pericentric 
inversion can change the relative 
lengths of the chromosome arms; 
a paracentric inversion cannot. 


When one homologous chromosome undergoes an inversion but the other 
does not, the individual is described as an inversion heterozygote. To 
maintain point-for-point synapsis during meiosis, one homolog must form a 
loop, and the other homolog must mold around it. Although this topology 
can ensure that the genes are correctly aligned, it also forces the homologs 
to stretch and can be associated with regions of imprecise synapsis ([link]). 


Inversion Pairing 


Conforming 
chromosome 


Looped chromosome 


When one chromosome undergoes 
an inversion but the other does 
not, one chromosome must form 
an inverted loop to retain point- 
for-point interaction during 
synapsis. This inversion pairing is 
essential to maintaining gene 
alignment during meiosis and to 
allow for recombination. 


Note: 


Evolution Connection 

The Chromosome 18 Inversion 

Not all structural rearrangements of chromosomes produce nonviable, 
impaired, or infertile individuals. In rare instances, such a change can 
result in the evolution of a new species. In fact, a pericentric inversion in 
chromosome 18 appears to have contributed to the evolution of humans. 
This inversion is not present in our closest genetic relatives, the 
chimpanzees. Humans and chimpanzees differ cytogenetically by 
pericentric inversions on several chromosomes and by the fusion of two 
separate chromosomes in chimpanzees that correspond to chromosome two 
in humans. 

The pericentric chromosome 18 inversion is believed to have occurred in 
early humans following their divergence from a common ancestor with 
chimpanzees approximately five million years ago. Researchers 
characterizing this inversion have suggested that approximately 19,000 
nucleotide bases were duplicated on 18p, and the duplicated region 
inverted and reinserted on chromosome 18 of an ancestral human. 

A comparison of human and chimpanzee genes in the region of this 
inversion indicates that two genes—ROCK1 and USP14—that are adjacent 
on chimpanzee chromosome 17 (which corresponds to human chromosome 
18) are more distantly positioned on human chromosome 18. This suggests 
that one of the inversion breakpoints occurred between these two genes. 
Interestingly, humans and chimpanzees express USP14 at distinct levels in 
specific cell types, including cortical cells and fibroblasts. Perhaps the 
chromosome 18 inversion in an ancestral human repositioned specific 
genes and reset their expression levels in a useful way. Because both 
ROCK1 and USP14 encode cellular enzymes, a change in their expression 
could alter cellular function. It is not known how this inversion contributed 
to hominid evolution, but it appears to be a significant factor in the 
divergence of humans from other primates. [footnote] 

Violaine Goidts et al., “Segmental duplication associated with the human- 
specific inversion of chromosome 18: a further example of the impact of 
segmental duplications on karyotype and genome evolution in primates,” 
Human Genetics 115 (2004): 116-122. 


Translocations 


A translocation occurs when a segment of a chromosome dissociates and 
reattaches to a different, nonhomologous chromosome. Translocations can 
be benign or have devastating effects depending on how the positions of 
genes are altered with respect to regulatory sequences. Notably, specific 
translocations have been associated with several cancers and with 
schizophrenia. Reciprocal translocations result from the exchange of 
chromosome segments between two nonhomologous chromosomes such 
that there is no gain or loss of genetic information ([link]). 
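
The claim that a reciprocal translocation rearranges genes without gaining or losing any can be checked with a short sketch. Chromosomes are modeled as strings of single-letter genes, and the two example chromosomes, the breakpoints, and the function name `reciprocal_translocation` are hypothetical.

```python
from collections import Counter


def reciprocal_translocation(chrom_a, chrom_b, break_a, break_b):
    """Exchange the segments lying beyond each breakpoint between two chromosomes."""
    new_a = chrom_a[:break_a] + chrom_b[break_b:]
    new_b = chrom_b[:break_b] + chrom_a[break_a:]
    return new_a, new_b


chrom_a, chrom_b = "ABCDE", "VWXYZ"  # two nonhomologous chromosomes
new_a, new_b = reciprocal_translocation(chrom_a, chrom_b, 3, 2)
print(new_a, new_b)

# The combined gene content is the same before and after the exchange.
print(Counter(chrom_a + chrom_b) == Counter(new_a + new_b))
```

Although the total gene content is conserved, each gene may now sit next to different neighbors, which is how a translocation can change a gene's position relative to regulatory sequences.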


Reciprocal Translocation 


[Figure panels: before translocation; after translocation (no gain or loss 
of genetic information)] 


A reciprocal translocation occurs 
when a segment of DNA is 
transferred from one chromosome 
to another, nonhomologous 
chromosome. (credit: modification 
of work by National Human 
Genome Research/USA) 


Section Summary 


The number, size, shape, and banding pattern of chromosomes make them 
easily identifiable in a karyogram and allow for the assessment of many 
chromosomal abnormalities. Disorders in chromosome number, or 
aneuploidies, are typically lethal to the embryo, although a few trisomic 
genotypes are viable. Because of X inactivation, aberrations in sex 
chromosomes typically have milder phenotypic effects. Aneuploidies also 
include instances in which segments of a chromosome are duplicated or 
deleted. Chromosome structures may also be rearranged, for example by 
inversion or translocation. Both of these aberrations can result in 
problematic phenotypic effects. Because they force chromosomes to assume 
unnatural topologies during meiosis, inversions and translocations are often 
associated with reduced fertility because of the likelihood of 
nondisjunction. 


Art Connections 


Exercise: 


Problem: 
[link] Which of the following statements about nondisjunction is true? 


a. Nondisjunction only results in gametes with n+1 or n-1 
chromosomes. 

b. Nondisjunction occurring during meiosis II results in 50 percent 
normal gametes. 

c. Nondisjunction during meiosis I results in 50 percent normal 
gametes. 

d. Nondisjunction always results in four different kinds of gametes. 


Solution: 


[link] B. 


Review Questions 


Exercise: 
Problem: 


Which of the following codes describes position 12 on the long arm of 
chromosome 13? 


a. 13p12 
b. 13q12 
c. 12p13 
d. 12q13 


Solution: 


B 
Exercise: 


Problem: 


In agriculture, polyploid crops (like coffee, strawberries, or bananas) 
tend to produce 


a. more uniformity 
b. more variety 

c. larger yields 

d. smaller yields 


Solution: 


C 


Exercise: 


Problem: 


Assume a pericentric inversion occurred in one of two homologs prior 
to meiosis. The other homolog remains normal. During meiosis, what 
structure—if any—would these homologs assume in order to pair 
accurately along their lengths? 


a. V formation 

b. cruciform 

c. loop 

d. pairing would not be possible 


Solution: 


C 


Exercise: 


Problem: The genotype XXY corresponds to 


a. Klinefelter syndrome 
b. Turner syndrome 

c. Triplo-X 

d. Jacob syndrome 


Solution: 


A 
Exercise: 
Problem: 
Abnormalities in the number of X chromosomes tend to have milder 
phenotypic effects than the same abnormalities in autosomes because 
of 


a. deletions 
b. nonhomologous recombination 
c. synapsis 
d. X inactivation 
Solution: 
D 
Exercise: 


Problem: By definition, a pericentric inversion includes the 


a. centromere 
b. chiasma 

c. telomere 

d. synapse 


Solution: 


A 


Free Response 


Exercise: 


Problem: 


Using diagrams, illustrate how nondisjunction can result in an 
aneuploid zygote. 


Solution: 


Exact diagram style will vary; diagram should look like [Link]. 


Glossary 


aneuploid 
individual with an error in chromosome number; includes deletions 
and duplications of chromosome segments 


autosome 
any of the non-sex chromosomes 


chromosome inversion 
detachment, 180° rotation, and reinsertion of a chromosome arm 


euploid 
individual with the appropriate number of chromosomes for their 
species 


karyogram 
photographic image of a karyotype 


karyotype 
number and appearance of an individual's chromosomes; includes the 
size, banding patterns, and centromere position 


monosomy 
otherwise diploid genotype in which one chromosome is missing 


nondisjunction 
failure of synapsed homologs (during meiosis I) or sister chromatids 
(during meiosis II) to completely separate and migrate to separate poles 


paracentric 
inversion that occurs outside of the centromere 


pericentric 
inversion that involves the centromere 


polyploid 
individual with an incorrect number of chromosome sets 


translocation 
process by which one segment of a chromosome dissociates and 
reattaches to a different, nonhomologous chromosome 


trisomy 
otherwise diploid genotype in which one entire chromosome is 
duplicated 


X inactivation 
condensation of X chromosomes into Barr bodies during embryonic 
development in females to compensate for the double genetic dose 


Introduction 


In genomics, the DNA of different organisms is compared, enabling 
scientists to create maps with which to navigate the DNA of different 
organisms. (credit "map": modification of photo by NASA) 


The study of nucleic acids began with the discovery of DNA, progressed to 
the study of genes and small fragments, and has now exploded to the field 
of genomics. Genomics is the study of entire genomes, including the 
complete set of genes, their nucleotide sequence and organization, and their 
interactions within a species and with other species. The advances in 
genomics have been made possible by DNA sequencing technology. Just as 
information technology has led to Google Maps that enable people to get 
detailed information about locations around the globe, genomic information 
is used to create similar maps of the DNA of different organisms. These 
findings have helped anthropologists to better understand human migration 
and have aided the field of medicine through the mapping of human genetic 
diseases. The ways in which genomic information can contribute to 
scientific understanding are varied and quickly growing. 


Mapping Genomes 
By the end of this section, you will be able to: 


• Define genomics 
• Describe genetic and physical maps 
• Describe genomic mapping methods 


Genomics is the study of entire genomes, including the complete set of 
genes, their nucleotide sequence and organization, and their interactions 
within a species and with other species. Genome mapping is the process of 
finding the locations of genes on each chromosome. The maps created by 
genome mapping are comparable to the maps that we use to navigate 
streets. A genetic map is an illustration that lists genes and their location on 
a chromosome. Genetic maps provide the big picture (similar to a map of 
interstate highways) and use genetic markers (similar to landmarks). A 
genetic marker is a gene or sequence on a chromosome that co-segregates 
(shows genetic linkage) with a specific trait. Early geneticists called this 
linkage analysis. Physical maps present the intimate details of smaller 
regions of the chromosomes (similar to a detailed road map). A physical 
map is a representation of the physical distance, in nucleotides, between 
genes or genetic markers. Both genetic linkage maps and physical maps are 
required to build a complete picture of the genome. Having a complete map 
of the genome makes it easier for researchers to study individual genes. 
Human genome maps help researchers in their efforts to identify human 
disease-causing genes related to illnesses like cancer, heart disease, and 
cystic fibrosis. Genome mapping can be used in a variety of other 
applications, such as using live microbes to clean up pollutants or even 
prevent pollution. Research involving plant genome mapping may lead to 
producing higher crop yields or developing plants that better adapt to 
climate change. 


Genetic Maps 


The study of genetic maps begins with linkage analysis, a procedure that 
analyzes the recombination frequency between genes to determine if they 
are linked or show independent assortment. The term linkage was used 
before the discovery of DNA. Early geneticists relied on the observation of 
phenotypic changes to understand the genotype of an organism. Shortly 
after Gregor Mendel (the father of modern genetics) proposed that traits 
were determined by what are now known as genes, other researchers 
observed that different traits were often inherited together, and thereby 
deduced that the genes were physically linked by being located on the same 
chromosome. The mapping of genes relative to each other based on linkage 
analysis led to the development of the first genetic maps. 


Observations that certain traits were always linked and certain others were 
not linked came from studying the offspring of crosses between parents 
with different traits. For example, in experiments performed on the garden 
pea, it was discovered that the color of the flower and shape of the plant’s 
pollen were linked traits, and therefore the genes encoding these traits were 
in close proximity on the same chromosome. The exchange of DNA 
between homologous pairs of chromosomes is called genetic 
recombination, which occurs by the crossing over of DNA between 
homologous strands of DNA, such as nonsister chromatids. Linkage 
analysis involves studying the recombination frequency between any two 
genes. The greater the distance between two genes, the higher the chance 
that a recombination event will occur between them, and the higher the 
recombination frequency between them. Two possibilities for 
recombination between two nonsister chromatids during meiosis are shown 
in [link]. If the recombination frequency between two genes is less than 50 
percent, they are said to be linked. 
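
As a worked illustration of this rule, recombination frequency can be computed as the percentage of offspring that show a recombinant combination of traits. The offspring counts below are made-up numbers, and the function names are assumptions chosen for this sketch.

```python
def recombination_frequency(recombinant_offspring, total_offspring):
    """Percentage of offspring carrying a recombinant combination of alleles."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100.0 * recombinant_offspring / total_offspring


def are_linked(frequency_percent):
    """Genes are considered linked when the frequency is below 50 percent."""
    return frequency_percent < 50.0


# Hypothetical cross: 170 recombinant offspring out of 1,000 scored.
frequency = recombination_frequency(170, 1000)
print(frequency, are_linked(frequency))
```

A frequency close to 50 percent is what independent assortment would produce, so such a result does not by itself place the two genes on the same chromosome.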


[Figure labels: crossover region resulting in A-B recombination; crossover 
region resulting in B-C recombination; alleles A/a, B/b, and C/c] 
Crossover may occur at different locations on the 
chromosome. Recombination between genes A and 
B is more frequent than recombination between 
genes B and C because genes A and B are farther 
apart; a crossover is therefore more likely to occur 
between them. 


The generation of genetic maps requires markers, just as a road map 
requires landmarks (such as rivers and mountains). Early genetic maps were 
based on the use of known genes as markers. More sophisticated markers, 
including those based on non-coding DNA, are now used to compare the 
genomes of individuals in a population. Although individuals of a given 
species are genetically similar, they are not identical; every individual has a 
unique set of traits. These minor differences in the genome between 
individuals in a population are useful for the purposes of genetic mapping. 
In general, a good genetic marker is a region on the chromosome that shows 
variability or polymorphism (multiple forms) in the population. 


Some genetic markers used in generating genetic maps are restriction 
fragment length polymorphisms (RFLPs), variable number of tandem 
repeats (VNTRs), microsatellite polymorphisms, and single 
nucleotide polymorphisms (SNPs). RFLPs (sometimes pronounced “rif-lips”) 
are detected when the DNA of an individual is cut with a restriction 
endonuclease that recognizes specific sequences in the DNA to generate a 
series of DNA fragments, which are then analyzed by gel electrophoresis. 
The DNA of every individual will give rise to a unique pattern of bands 
when cut with a particular set of restriction endonucleases; this is 
sometimes referred to as an individual’s DNA “fingerprint.” Certain regions 
of the chromosome that are subject to polymorphism will lead to the 
generation of the unique banding pattern. VNTRs are repeated sets of 
nucleotides present in the non-coding regions of DNA. Non-coding, or 
“junk,” DNA has no known biological function; however, research shows 
that much of this DNA is actually transcribed. While its function is 
uncertain, it is certainly active, and it may be involved in the regulation of 
coding genes. The number of repeats may vary in individual organisms of a 
population. Microsatellite polymorphisms are similar to VNTRs, but the 
repeat unit is very small. SNPs are variations in a single nucleotide. 
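
The way a polymorphism produces a different banding pattern can be sketched in a few lines. The example uses the recognition sequence of the restriction endonuclease EcoRI (GAATTC, cut after the first G); the two short DNA sequences are invented for illustration, and the function name `restriction_fragments` is an assumption. The sketch considers only one strand and ignores the staggered ends that the enzyme leaves.

```python
def restriction_fragments(dna, site, cut_offset):
    """Return the lengths of the fragments made by cutting dna at every site.

    cut_offset is the position of the cut within the recognition site.
    """
    cut_positions = []
    start = dna.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = dna.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(dna)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


ECORI_SITE = "GAATTC"  # EcoRI cuts between G and AATTC

# Two hypothetical alleles that differ by a single nucleotide (A -> G),
# which removes the second EcoRI site from allele_2.
allele_1 = "ATGCGAATTCTTAGGCCGAATTCAAT"
allele_2 = "ATGCGAATTCTTAGGCCGAGTTCAAT"

print(restriction_fragments(allele_1, ECORI_SITE, 1))
print(restriction_fragments(allele_2, ECORI_SITE, 1))
```

On a gel, the first allele would appear as three bands and the second as two, which is the kind of difference in fragment length that an RFLP marker detects.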


Because genetic maps rely completely on the natural process of 
recombination, mapping is affected by natural increases or decreases in the 
level of recombination in any given area of the genome. Some parts of the 
genome are recombination hotspots, whereas others do not show a 
propensity for recombination. For this reason, it is important to look at 
mapping information developed by multiple methods. 


Physical Maps 


A physical map provides detail of the actual physical distance between 
genetic markers, as well as the number of nucleotides. There are three 
methods used to create a physical map: cytogenetic mapping, radiation 
hybrid mapping, and sequence mapping. Cytogenetic mapping uses 
information obtained by microscopic analysis of stained sections of the 
chromosome ([link]). It is possible to determine the approximate distance 
between genetic markers using cytogenetic mapping, but not the exact 
distance (number of base pairs). Radiation hybrid mapping uses radiation, 
such as x-rays, to break the DNA into fragments. The amount of radiation 
can be adjusted to create smaller or larger fragments. This technique 
overcomes the limitation of genetic mapping and is not affected by 
increased or decreased recombination frequency. Sequence mapping 
resulted from DNA sequencing technology that allowed for the creation of 
detailed physical maps with distances measured in terms of the number of 
base pairs. The creation of genomic libraries and complementary DNA 
(cDNA) libraries (collections of cloned sequences or all DNA from a 
genome) has sped up the process of physical mapping. A genetic site used 
to generate a physical map with sequencing technology (a sequence-tagged 
site, or STS) is a unique sequence in the genome with a known exact 
chromosomal location. An expressed sequence tag (EST) and a simple 
sequence length polymorphism (SSLP) are common STSs. An EST is a 
short STS that is identified with cDNA libraries, while SSLPs are obtained 
from known genetic markers and provide a link between genetic maps and 
physical maps. 


[Figure panels: autosomes; sex chromosomes] 


A cytogenetic map shows the 
appearance of a chromosome after 
it is stained and examined under a 
microscope. (credit: National 
Human Genome Research 
Institute) 

Integration of Genetic and Physical Maps 


Genetic maps provide the outline and physical maps provide the details. It 
is easy to understand why both types of genome mapping techniques are 
important to show the big picture. Information obtained from each 
technique is used in combination to study the genome. Genomic mapping is 
being used with different model organisms that are used for research. 
Genome mapping is still an ongoing process, and as more advanced 
techniques are developed, more advances are expected. Genome mapping is 
similar to completing a complicated puzzle using every piece of available 
data. Mapping information generated in laboratories all over the world is 
entered into central databases, such as GenBank at the National Center for 
Biotechnology Information (NCBI). Efforts are being made to make the 
information more easily accessible to researchers and the general public. 
Just as we use global positioning systems instead of paper maps to navigate 
through roadways, NCBI has created a genome viewer tool to simplify the 
data-mining process. 


Note: 

Scientific Method Connection 

How to Use a Genome Map Viewer 

Problem statement: Do the human, macaque, and mouse genomes contain 
common DNA sequences? 

Develop a hypothesis. 

To test the hypothesis, click this link. 

In the Search box on the left panel, type any gene name or phenotypic 
characteristic, such as iris pigmentation (eye color). Select the species you 
want to study, and then press Enter. The genome map viewer will indicate 
which chromosome encodes the gene in your search. Click each hit in the 
genome viewer for more detailed information. This type of search is the 
most basic use of the genome viewer; it can also be used to compare 
sequences between species, as well as many other complicated tasks. 

Is the hypothesis correct? Why or why not? 


Note: 
Link to Learning 


Online Mendelian Inheritance in Man (OMIM) is a searchable online 
catalog of human genes and genetic disorders. This website shows genome 
mapping information, and also details the history and research of each trait 
and disorder. Click this link to search for traits (such as handedness) and 
genetic disorders (such as diabetes). 


Section Summary 


Genome mapping is similar to solving a big, complicated puzzle with 
pieces of information coming from laboratories all over the world. Genetic 
maps provide an outline for the location of genes within a genome, and they 
estimate the distance between genes and genetic markers on the basis of 
recombination frequencies during meiosis. Physical maps provide detailed 
information about the physical distance between the genes. The most 
detailed information is available through sequence mapping. Information 
from all mapping and sequencing sources is combined to study an entire 
genome. 


Review Questions 


Exercise: 


Problem: ESTs are 


a. generated after a cDNA library is made 

b. unique sequences in the genome 

c. useful for mapping using sequence information 
d. all of the above 


Solution: 


D 


Exercise: 


Problem: Linkage analysis 


a. is used to create a physical map 

b. is based on the natural recombination process 

c. requires radiation hybrid mapping 

d. involves breaking and re-joining of DNA artificially 


Solution: 


B 


Exercise: 


Problem: Genetic recombination occurs by which process? 


a. independent assortment 
b. crossing over 

c. chromosome segregation 
d. sister chromatids 


Solution: 
B 
Exercise: 
Problem: Individual genetic maps in a given species are: 


a. genetically similar 
b. genetically identical 


c. genetically dissimilar 
d. not useful in species analysis 


Solution: 


A 
Exercise: 


Problem: 


Information obtained by microscopic analysis of stained chromosomes 
is used in: 


a. radiation hybrid mapping 
b. sequence mapping 

c. RFLP mapping 

d. cytogenetic mapping 


Solution: 


D 


Free Response 


Exercise: 


Problem: 


Why is so much effort being poured into genome mapping 
applications? 


Solution: 


Genome mapping has many different applications and provides 
comprehensive information that can be used for predictive purposes. 


Exercise: 


Problem: 


How could a genetic map of the human genome help find a cure for 
cancer? 


Solution: 


A human genetic map can help identify genetic markers and sequences 
associated with high cancer risk, which can help to screen and provide 
early detection of different types of cancer. 


Glossary 


cytogenetic mapping 
technique that uses a microscope to create a map from stained 
chromosomes 


expressed sequence tag (EST) 
short STS that is identified with cDNA 


genetic map 
outline of genes and their location on a chromosome 


genetic marker 
gene or sequence on a chromosome with a known location that is 
associated with a specific trait 


genetic recombination 
exchange of DNA between homologous pairs of chromosomes 


genome mapping 
process of finding the location of genes on each chromosome 


cDNA library 
collection of cloned cDNA sequences 


genomic library 


collection of cloned DNA which represents all of the sequences and 
fragments from a genome 


genomics 
study of entire genomes including the complete set of genes, their 
nucleotide sequence and organization, and their interactions within a 
species and with other species 


linkage analysis 
procedure that analyzes the recombination of genes to determine if 
they are linked 


microsatellite polymorphism 
variation between individuals in the sequence and number of repeats of 
microsatellite DNA 


physical map 
representation of the physical distance between genes or genetic 
markers 


radiation hybrid mapping 
information obtained by fragmenting the chromosome with x-rays 


restriction fragment length polymorphism (RFLP) 
variation between individuals in the length of DNA fragments 
generated by restriction endonucleases 


sequence mapping 
mapping information obtained after DNA sequencing 


single nucleotide polymorphism (SNP) 
variation between individuals in a single nucleotide 


variable number of tandem repeats (VNTRs) 
variation in the number of tandem repeats between individuals in the 
population 


Whole-Genome Sequencing 
By the end of this section, you will be able to: 


• Describe three types of sequencing 
• Define whole-genome sequencing 


Although there have been significant advances in the medical sciences in 
recent years, doctors are still confounded by some diseases, and they are 
using whole-genome sequencing to get to the bottom of the problem. 
Whole-genome sequencing is a process that determines the DNA sequence 
of an entire genome. Whole-genome sequencing is a brute-force approach 
to problem solving when there is a genetic basis at the core of a disease. 
Several laboratories now provide services to sequence, analyze, and 
interpret entire genomes. 


Whole-exome sequencing, for example, is a lower-cost alternative to 
whole-genome sequencing. In exome sequencing, only the coding, exon-producing 
regions of the DNA are sequenced. In 2010, whole-exome sequencing was 
used to save a young boy whose intestines had multiple mysterious 
abscesses. The child had several colon operations with no relief. Finally, 
whole-exome sequencing was performed, which revealed a defect in a 
pathway that controls apoptosis (programmed cell death). A bone-marrow 
transplant was used to overcome this genetic disorder, leading to a cure for 
the boy. He was the first person to be successfully treated based on a 
diagnosis made by whole-exome sequencing. Today, human genome 
sequencing is more readily available and can be completed in a day or two 
for about $1000. 


Strategies Used in Sequencing Projects 


The basic sequencing technique used in all modern-day sequencing projects 
is the chain termination method (also known as the dideoxy method), which 
was developed by Fred Sanger in the 1970s. The chain termination method 
involves DNA replication of a single-stranded template with the use of a 
primer and a regular deoxynucleotide (dNTP), which is a monomer, or a 
single unit, of DNA. The primer and dNTP are mixed with a small 
proportion of fluorescently labeled dideoxynucleotides (ddNTPs). The 


ddNTPs are monomers that are missing a hydroxyl group (—OH) at the site 
at which another nucleotide usually attaches to form a chain ([link]). Each 
ddNTP is labeled with a different color of fluorophore. Every time a ddNTP 
is incorporated in the growing complementary strand, it terminates the 
process of DNA replication, which results in multiple short strands of 
replicated DNA that are each terminated at a different point during 
replication. When the reaction mixture is processed by gel electrophoresis 
after being separated into single strands, the multiple newly replicated DNA 
strands form a ladder because of the differing sizes. Because the ddNTPs 
are fluorescently labeled, each band on the gel reflects the size of the DNA 
strand and the ddNTP that terminated the reaction. The different colors of 
the fluorophore-labeled ddNTPs help identify the ddNTP incorporated at 
that position. Reading the gel on the basis of the color of each band on the 
ladder produces the sequence of the template strand ([link]). 
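
The ladder-reading logic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not laboratory software: the function names, the 10 percent chance of incorporating a ddNTP at each position, and the number of simulated strands are all hypothetical values chosen for the example. The template is written 3' to 5' so that the newly made strand reads 5' to 3' from left to right.

```python
import random

# Watson-Crick pairing used to build the new strand from the template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def chain_termination_fragments(template, dd_fraction=0.1, n_strands=2000, seed=0):
    """Simulate many copies of the replication reaction.

    At each position a labeled ddNTP is added with probability dd_fraction
    (an assumed value), which stops that strand. Returns a dict mapping
    fragment length -> the labeled base that terminated the fragment.
    Strands that never pick up a ddNTP carry no label and are ignored.
    """
    rng = random.Random(seed)
    ladder = {}
    for _ in range(n_strands):
        for length, template_base in enumerate(template, start=1):
            if rng.random() < dd_fraction:
                ladder[length] = COMPLEMENT[template_base]
                break
    return ladder


def read_ladder(ladder):
    """Read the gel from the shortest fragment to the longest."""
    return "".join(ladder[length] for length in sorted(ladder))


def infer_template(new_strand):
    """Complement the read to recover the template bases."""
    return "".join(COMPLEMENT[base] for base in new_strand)


if __name__ == "__main__":
    ladder = chain_termination_fragments("TACGGATC")
    new_strand = read_ladder(ladder)
    print("newly synthesized strand:", new_strand)
    print("inferred template:", infer_template(new_strand))
```

Each fragment length appears in the ladder only if at least one simulated strand stopped there, which is why the reaction needs many template copies and only a small proportion of ddNTPs.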


[Figure: chemical structures of a dideoxynucleotide (ddNTP) and a deoxynucleotide (dNTP)]


A dideoxynucleotide is similar in 
structure to a deoxynucleotide, but is 
missing the 3' hydroxyl group 
(indicated by the box). When a 
dideoxynucleotide is incorporated into 
a DNA strand, DNA synthesis stops. 


[Figure: Dye-labeled dideoxynucleotides are used to generate DNA fragments of different lengths (example read: GATAAATCTGGTCTTATTTCC).]


Frederick Sanger's dideoxy chain termination 
method is illustrated. Using dideoxynucleotides, 
the DNA fragment can be terminated at different 
points. The DNA is separated on the basis of size, 
and these bands, based on the size of the 
fragments, can be read. 


Early Strategies: Shotgun Sequencing and Pair-Wise End Sequencing 


In the shotgun sequencing method, several copies of a DNA fragment are cut 
randomly into many smaller pieces (somewhat like what happens to a round 
shot cartridge when fired from a shotgun). All of the segments are then 
sequenced using the chain termination method. Then, with the help of a 
computer, the fragments are analyzed to see where their sequences overlap. 
By matching up overlapping sequences at the end of each fragment, the 
entire DNA sequence can be reformed. A larger sequence that is assembled 
from overlapping shorter sequences is called a contig. As an analogy, 
consider that someone has four copies of a landscape photograph that you 
have never seen before and know nothing about how it should appear. The 
person then rips up each photograph with their hands, so that different size 
pieces are present from each copy. The person then mixes all of the pieces 
together and asks you to reconstruct the photograph. In one of the smaller 
pieces you see a mountain. In a larger piece, you see that the same mountain 


is behind a lake. A third fragment shows only the lake, but it reveals that 
there is a cabin on the shore of the lake. Therefore, from looking at the 
overlapping information in these three fragments, you know that the picture 
contains a mountain behind a lake that has a cabin on its shore. This is the 
principle behind reconstructing entire DNA sequences using shotgun 
sequencing. 
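
The overlap-matching step can be illustrated with a small greedy assembler. This is a sketch under simplifying assumptions (error-free reads, a single strand, no repeated regions, and a hypothetical minimum overlap of three bases); the function names are invented for this example, and real assembly software is far more elaborate.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when the shared region is shorter than min_len (an assumed cutoff).
    """
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def assemble(fragments, min_len=3):
    """Greedily merge the pair of fragments with the largest overlap.

    Repeats until no overlaps remain. Each sequence left at the end is a contig.
    """
    reads = list(fragments)
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i != j:
                    size = overlap(left, right, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing overlaps: the remaining reads are separate contigs
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    # Three overlapping pieces of an arbitrary 21-base example sequence.
    pieces = ["GTCTTATTTCC", "GATAAATCTG", "ATCTGGTCTT"]
    print(assemble(pieces))
```

As with the torn photograph, no single piece shows the whole picture; the full sequence emerges only because neighboring pieces share a stretch of identical bases.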


Originally, shotgun sequencing only analyzed one end of each fragment for 
overlaps. This was sufficient for sequencing small genomes. However, the 
desire to sequence larger genomes, such as that of a human, led to the 
development of double-barrel shotgun sequencing, more formally known as 
pairwise-end sequencing. In pairwise-end sequencing, both ends of each 
fragment are analyzed for overlap. Pairwise-end sequencing is, therefore, 
more cumbersome than shotgun sequencing, but it is easier to reconstruct 
the sequence because there is more available information. 


Next-generation Sequencing 


Since 2005, the automated sequencing techniques used by laboratories have 
fallen under the umbrella of next-generation sequencing, which is a group of 
automated techniques used for rapid DNA sequencing. These automated 
low-cost sequencers can generate sequences of hundreds of thousands or 
millions of short fragments (25 to 500 base pairs) in the span of one day. 
These sequencers use sophisticated software to get through the cumbersome 
process of putting all the fragments in order. 


Note: 

Evolution Connection 

Comparing Sequences 

A sequence alignment is an arrangement of proteins, DNA, or RNA; it is 
used to identify regions of similarity between cell types or species, which 
may indicate conservation of function or structures. Sequence alignments 
may be used to construct phylogenetic trees. The following website uses a 
software program called BLAST (basic local alignment search tool). 


Under "Basic Blast," click "Nucleotide Blast." Input the following 
sequence into the large "query sequence" box: ATTGCTTCGATTGCA. 
Below the box, locate the "Species" field and type "human" or "Homo 
sapiens". Then click "BLAST" to compare the inputted sequence against 
known sequences of the human genome. The result is that this sequence 
occurs in over a hundred places in the human genome. Scroll down below 
the graphic with the horizontal bars and you will see a short description of 
each of the matching hits. Pick one of the hits near the top of the list and 
click on "Graphics". This will bring you to a page that shows where the 
sequence is found within the entire human genome. You can move the 
slider that looks like a green flag back and forth to view the sequences 
immediately around the selected gene. You can then return to your selected 
sequence by clicking the "ATG" button. 
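
The idea of searching a genome for a query sequence can be sketched very roughly in Python. This is not how BLAST works internally: BLAST uses heuristics to find approximate local alignments, whereas the hypothetical helper below reports only exact matches, on either strand, in a short made-up subject sequence.

```python
def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_exact_hits(query, subject):
    """List (strand, start) for every exact occurrence of query in subject.

    Start positions are 0-based and counted on the subject as written.
    A "-" hit means the query matches the complementary strand there.
    """
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        start = subject.find(pattern)
        while start != -1:
            hits.append((strand, start))
            start = subject.find(pattern, start + 1)
    return hits


if __name__ == "__main__":
    query = "ATTGCTTCGATTGCA"
    # Made-up subject: the query, a spacer, then its reverse complement.
    subject = "GG" + query + "CC" + reverse_complement(query) + "A"
    print(find_exact_hits(query, subject))
```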


Use of Whole-Genome Sequences of Model Organisms 


The first genome to be completely sequenced was of a bacterial virus, the 
bacteriophage φX174 (5386 base pairs); this was accomplished by Fred 
Sanger using shotgun sequencing. Several other organelle and viral 
genomes were later sequenced. The first organism whose genome was 
sequenced was the bacterium Haemophilus influenzae; this was 
accomplished by Craig Venter and colleagues in 1995. Approximately 74 different 
laboratories collaborated on the sequencing of the genome of the yeast 
Saccharomyces cerevisiae, which began in 1989 and was completed in 
1996, because it was 60 times bigger than any other genome that had been 
sequenced. By 1997, the genome sequences of two important model 
organisms were available: the bacterium Escherichia coli K12 and the yeast 
Saccharomyces cerevisiae. Genomes of other model organisms, such as the 
mouse Mus musculus, the fruit fly Drosophila melanogaster, the nematode 
Caenorhabditis elegans, and humans (Homo sapiens) are now known. A lot 
of basic research is performed in model organisms because the information 
can be applied to genetically similar organisms. A model organism is a 
species that is studied as a model to understand the biological processes in 
other species represented by the model organism. Having entire genomes 
sequenced helps with the research efforts in these model organisms. The 


process of attaching biological information to gene sequences is called 
genome annotation. Annotation of gene sequences helps with basic 
experiments in molecular biology, such as designing PCR primers and RNA 
targets. 


Note: 
Link to Learning 
Click through each step of genome sequencing at this site. 


Uses of Genome Sequences 


DNA microarrays are methods used to detect gene expression by 
analyzing an array of DNA fragments that are fixed to a glass slide or a 
silicon chip to identify active genes and identify sequences. Almost one 
million genotypic abnormalities can be discovered using microarrays, 
whereas whole-genome sequencing can provide information about all six 
billion base pairs in the human genome. Although the study of medical 
applications of genome sequencing is interesting, this discipline tends to 
dwell on abnormal gene function. Knowledge of the entire genome will 
allow future onset diseases and other genetic disorders to be discovered 
early, which will allow for more informed decisions to be made about 
lifestyle, medication, and having children. Genomics is still in its infancy, 
although someday it may become routine to use whole-genome sequencing 
to screen every newborn to detect genetic abnormalities. 


In addition to disease and medicine, genomics can contribute to the 
development of novel enzymes that convert biomass to biofuel, which 


results in higher crop and fuel production, and lower cost to the consumer. 
This knowledge should allow better methods of control over the microbes 
that are used in the production of biofuels. Genomics could also improve 
the methods used to monitor the impact of pollutants on ecosystems and 
help clean up environmental contaminants. Genomics has allowed for the 
development of agrochemicals and pharmaceuticals that could benefit 
medical science and agriculture. 


It sounds great to have all the knowledge we can get from whole-genome 
sequencing; however, humans have a responsibility to use this knowledge 
wisely. Otherwise, it could be easy to misuse the power of such knowledge, 
leading to discrimination based on a person's genetics, human genetic 
engineering, and other ethical concerns. This information could also lead to 
legal issues regarding health and privacy. 


Section Summary 


Whole-genome sequencing is the latest available resource to treat genetic 
diseases. Some doctors are using whole-genome sequencing to save lives. 
Genomics has many industrial applications including biofuel development, 
agriculture, pharmaceuticals, and pollution control. The basic principle of 
all modern-day sequencing strategies involves the chain termination method 
of sequencing. 


Although the human genome sequences provide key insights to medical 
professionals, researchers use whole-genome sequences of model organisms 
to better understand the genome of the species. Automation and the 
decreased cost of whole-genome sequencing may lead to personalized 
medicine in the future. 


Review Questions 


Exercise: 


Problem: The chain termination method of sequencing: 


a. uses labeled ddNTPs 

b. uses only dideoxynucleotides 
c. uses only deoxynucleotides 
d. uses labeled dNTPs 


Solution: 


A 


Exercise: 


Problem: Whole-genome sequencing can be used for advances in: 


a. the medical field 
b. agriculture 

c. biofuels 

d. all of the above 


Solution: 


D 


Exercise: 


Problem: Sequencing an individual person’s genome 


a. is currently possible 

b. could lead to legal issues regarding discrimination and privacy 
c. could help make informed choices about medical treatment 

d. all of the above 


Solution: 


D 


Exercise: 


Problem: 
What is the most challenging issue facing genome sequencing? 


a. the inability to develop fast and accurate sequencing techniques 

b. the ethics of using information from genomes at the individual 
level 

c. the availability and stability of DNA 

d. all of the above 


Solution: 


B 


Glossary 


chain termination method 
method of DNA sequencing using labeled dideoxynucleotides to 
terminate DNA replication; it is also called the dideoxy method or the 
Sanger method 


contig 
larger sequence of DNA assembled from overlapping shorter 
sequences 


deoxynucleotide 
individual monomer (single unit) of DNA 


dideoxynucleotide 
individual monomer of DNA that is missing a hydroxyl group (-OH) 


DNA microarray 
method used to detect gene expression by analyzing an array of DNA 
fragments that are fixed to a glass slide or a silicon chip to identify 
active genes and identify sequences 


genome annotation 
process of attaching biological information to gene sequences 


model organism 
species that is studied and used as a model to understand the biological 
processes in other species represented by the model organism 


next-generation sequencing 
group of automated techniques used for rapid DNA sequencing 


shotgun sequencing 
method used to sequence multiple DNA fragments to generate the 
sequence of a large piece of DNA 


whole-genome sequencing 
process that determines the DNA sequence of an entire genome 


Introduction 


Dolly the sheep was the first large mammal to be cloned. 


The three letters “DNA” have now become synonymous with crime 
solving, paternity testing, human identification, and genetic testing. DNA 
can be retrieved from hair, blood, or saliva. Each person’s DNA is unique, 
and it is possible to detect differences between individuals within a species 
on the basis of these unique features. 


DNA analysis has many practical applications beyond forensics. In humans, 
DNA testing is applied to numerous uses: determining paternity, tracing 


genealogy, identifying pathogens, archeological research, tracing disease 
outbreaks, and studying human migration patterns. In the medical field, 
DNA is used in diagnostics, new vaccine development, and cancer therapy. 
It is now possible to determine predisposition to diseases by looking at 
genes. 


Each human cell has 23 pairs of chromosomes: one set of chromosomes is 
inherited from the mother and the other set is inherited from the father. 
There is also a mitochondrial genome, inherited exclusively from the 
mother, which can be involved in inherited genetic disorders. On each 
chromosome, there are thousands of genes that are responsible for 
determining the genotype and phenotype of the individual. A gene is 
defined as a sequence of DNA that codes for a functional product. The 
human haploid genome contains 3 billion base pairs and has between 
20,000 and 25,000 functional genes. 


Historical Basis of Modern Understanding 
By the end of this section, you will be able to: 


• Explain transformation of DNA 

• Describe the key experiments that helped identify that DNA is the 
genetic material 

• State and explain Chargaff’s rules 


Modern understandings of DNA have evolved from the discovery of 
nucleic acid to the development of the double-helix model. In the 1860s, 
Friedrich Miescher ([link]), a physician by profession, was the first person 
to isolate phosphate-rich chemicals from white blood cells or leukocytes. 
He named these chemicals (which would eventually be known as RNA and 
DNA) nuclein because they were isolated from the nuclei of the cells. 


Friedrich Miescher (1844-1895) discovered nucleic acids. 


Note: 
Link to Learning 


To see Miescher conduct an experiment step-by-step, click through this 
review of how he discovered the key role of DNA and proteins in the 
nucleus. 


A half century later, British bacteriologist Frederick Griffith was perhaps 
the first person to show that hereditary information could be transferred 
from one cell to another “horizontally,” rather than by descent. In 1928, he 
reported the first demonstration of bacterial transformation, a process in 
which external DNA is taken up by a cell, thereby changing morphology 
and physiology. He was working with Streptococcus pneumoniae, the 
bacterium that causes pneumonia. Griffith worked with two strains, rough 
(R) and smooth (S). The R strain is non-pathogenic (does not cause disease) 
and is called rough because its outer surface is a cell wall and lacks a 
capsule; as a result, the cell surface appears uneven under the microscope. 
The S strain is pathogenic (disease-causing) and has a capsule outside its 
cell wall. As a result, it has a smooth appearance under the microscope. 
Griffith injected the live R strain into mice and they survived. In another 
experiment, when he injected mice with the heat-killed S strain, they also 
survived. In a third set of experiments, a mixture of live R strain and heat- 
killed S strain were injected into mice, and—to his surprise—the mice died. 
Upon isolating the live bacteria from the dead mouse, only the S strain of 
bacteria was recovered. When this isolated S strain was injected into fresh 
mice, the mice died. Griffith concluded that something had passed from the 
heat-killed S strain into the live R strain and transformed it into the 
pathogenic S strain, and he called this the transforming principle ([link]). 
These experiments are now famously known as Griffith's transformation 
experiments. 


[Figure panels: a mouse injected with the heat-killed virulent S strain lives; a mouse injected with both the heat-killed S strain and the live non-virulent R strain dies.]


Two strains of S. pneumoniae were used in Griffith’s 
transformation experiments. The R strain is non-pathogenic. 
The S strain is pathogenic and causes death. When Griffith 
injected a mouse with the heat-killed S strain and a live R 
strain, the mouse died. The S strain was recovered from the 
dead mouse. Thus, Griffith concluded that something had 
passed from the heat-killed S strain to the R strain, 
transforming the R strain into S strain in the process. (credit 
"living mouse": modification of work by NIH; credit "dead 
mouse": modification of work by Sarah Marriage) 


Scientists Oswald Avery, Colin MacLeod, and Maclyn McCarty (1944) 
were interested in exploring this transforming principle further. They 
isolated the S strain from the dead mice and isolated the proteins and 
nucleic acids, namely RNA and DNA, as these were possible candidates for 
the molecule of heredity. They conducted a systematic elimination study. 
They used enzymes that specifically degraded each component and then 
used each mixture separately to transform the R strain. They found that 
when DNA was degraded, the resulting mixture was no longer able to 
transform the bacteria, whereas all of the other combinations were able to 
transform the bacteria. This led them to conclude that DNA was the 
transforming principle. 


Note: 

Career Connection 

Forensic Scientists and DNA Analysis 

DNA evidence was used for the first time to solve an immigration case. 
The story started with a teenage boy returning to London from Ghana to be 
with his mother. Immigration authorities at the airport were suspicious of 
him, thinking that he was traveling on a forged passport. After much 
persuasion, he was allowed to go live with his mother, but the immigration 
authorities did not drop the case against him. All types of evidence, 
including photographs, were provided to the authorities, but deportation 
proceedings were started nevertheless. Around the same time, Dr. Alec 
Jeffreys of Leicester University in the United Kingdom had invented a 
technique known as DNA fingerprinting. The immigration authorities 
approached Dr. Jeffreys for help. He took DNA samples from the mother 
and three of her children, plus an unrelated mother, and compared the 
samples with the boy’s DNA. Because the biological father was not in the 
picture, DNA from the three children was compared with the boy’s DNA. 
He found a match in the boy’s DNA for both the mother and his three 
siblings. He concluded that the boy was indeed the mother’s son. 

Forensic scientists analyze many items, including documents, handwriting, 
firearms, and biological samples. They analyze the DNA content of hair, 
semen, saliva, and blood, and compare it with a database of DNA profiles 
of known criminals. Analysis includes DNA isolation, sequencing, and 
sequence analysis; most forensic DNA analysis involves polymerase chain 
reaction (PCR) amplification of short tandem repeat (STR) loci and 
electrophoresis to determine the length of the PCR-amplified fragment. 
Only mitochondrial DNA is sequenced for forensics. Forensic scientists are 
expected to appear at court hearings to present their findings. They are 
usually employed in crime labs of city and state government agencies. 
Geneticists experimenting with DNA techniques also work for scientific 
and research organizations, pharmaceutical industries, and college and 
university labs. Students wishing to pursue a career as a forensic scientist 
should have at least a bachelor's degree in chemistry, biology, or physics, 
and preferably some experience working in a laboratory. 


Experiments conducted by Martha Chase and Alfred Hershey in 1952 
provided confirmatory evidence that DNA was the genetic material and not 
proteins. Chase and Hershey were studying a bacteriophage, which is a 
virus that infects bacteria. Viruses typically have a simple structure: a 
protein coat, called the capsid, and a nucleic acid core that contains the 
genetic material, either DNA or RNA. The bacteriophage infects the host 
bacterial cell by attaching to its surface, and then it injects its nucleic acids 
inside the cell. The phage DNA makes multiple copies of itself using the 
host machinery, and eventually the host cell bursts, releasing a large number 
of bacteriophages. Hershey and Chase labeled one batch of phage with 
radioactive sulfur, ³⁵S, to label the protein coat. Another batch of phage 
was labeled with radioactive phosphorus, ³²P. Because phosphorus is 
found in DNA, but not protein, the DNA and not the protein would be 
tagged with radioactive phosphorus. 


Each batch of phage was allowed to infect the cells separately. After 
infection, the phage bacterial suspension was put in a blender, which caused 
the phage coat to be detached from the host cell. The phage and bacterial 
suspension was spun down in a centrifuge. The heavier bacterial cells 
settled down and formed a pellet, whereas the lighter phage particles stayed 
in the supernatant. In the tube that contained phage labeled with ³⁵S, the 
supernatant contained the radioactively labeled phage, whereas no 
radioactivity was detected in the pellet. In the tube that contained the phage 
labeled with ³²P, the radioactivity was detected in the pellet that contained 
the heavier bacterial cells, and no radioactivity was detected in the 
supernatant. Hershey and Chase concluded that it was the phage DNA that 
was injected into the cell and carried information to produce more phage 
particles, thus providing evidence that DNA was the genetic material and 
not proteins ([link]). 
In Hershey and Chase's experiments, 
bacteria were infected with phage 
radiolabeled with either ³⁵S, which labels 
protein, or ³²P, which labels DNA. Only ³²P 
entered the bacterial cells, indicating that 
DNA is the genetic material. 


Around this same time, Austrian biochemist Erwin Chargaff examined the 
content of DNA in different species and found that adenine, thymine, 
guanine, and cytosine were not present in equal quantities, and that their 
relative amounts varied from species to species, but not between individuals 
of the same species. He found that the amount of adenine equals the amount 
of thymine, and the amount of cytosine equals the amount of guanine, or 
A = T and G = C. These relationships are known as Chargaff’s rules. This finding proved immensely 
useful when Watson and Crick were getting ready to propose their DNA 
double helix model. 
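
Chargaff's rules lend themselves to a short worked calculation. The Python sketch below is illustrative only, with hypothetical function names: one helper counts bases across both strands of a made-up duplex, and the other uses A = T and G = C to predict the full composition from the adenine percentage alone.

```python
from collections import Counter

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_percentages(strand):
    """Percent of each base in double-stranded DNA, given one strand."""
    both_strands = strand + "".join(PAIR[base] for base in strand)
    counts = Counter(both_strands)
    return {base: 100 * counts[base] / len(both_strands) for base in "ATGC"}


def predict_from_adenine(percent_a):
    """Apply A = T and G = C to fill in the other three percentages."""
    percent_g_or_c = (100 - 2 * percent_a) / 2
    return {"A": percent_a, "T": percent_a, "G": percent_g_or_c, "C": percent_g_or_c}


if __name__ == "__main__":
    print(base_percentages("AATGC"))   # made-up five-base strand
    print(predict_from_adenine(27))    # DNA that is 27 percent adenine
```

For DNA that is 27 percent adenine, thymine must also be 27 percent, leaving 46 percent to be split equally between guanine and cytosine at 23 percent each.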


Section Summary 


DNA was first isolated from white blood cells by Friedrich Miescher, who 
called it nuclein because it was isolated from nuclei. Frederick Griffith's 
experiments with strains of Streptococcus pneumoniae provided the first 
hint that DNA may be the transforming principle. Avery, MacLeod, and 
McCarty proved that DNA is required for the transformation of bacteria. 
Later experiments by Hershey and Chase using bacteriophage T2 proved 
that DNA is the genetic material. Chargaff found that A = T and 
C = G, and that the percentage content of A, T, G, and C is different for 
different species. 


Review Questions 


Exercise: 
Problem: 


If DNA of a particular species was analyzed and it was found that it 
contains 27 percent A, what would be the percentage of C? 


a. 27 percent 
b. 30 percent 
c. 23 percent 
d. 54 percent 


Solution: 


C 
Exercise: 
Problem: 


The experiments by Hershey and Chase helped confirm that DNA was 
the hereditary material on the basis of the finding that: 


a. radioactive phage were found in the pellet 
b. radioactive cells were found in the supernatant 
c. radioactive sulfur was found inside the cell 


d. radioactive phosphorus was found in the cell 


Solution: 


D 


Free Response 


Exercise: 


Problem: 


Explain Griffith's transformation experiments. What did he conclude 
from them? 


Solution: 


Live R cells acquired genetic information from the heat-killed S cells 
that “transformed” the R cells into S cells. 


Exercise: 


Problem: 


Why were radioactive sulfur and phosphorus used to label 
bacteriophage in Hershey and Chase's experiments? 


Solution: 
Sulfur is an element found in proteins and phosphorus is a component 
of nucleic acids. 

Glossary 


transformation 
process in which external DNA is taken up by a cell 


DNA Structure and Sequencing 
By the end of this section, you will be able to: 


• Describe the structure of DNA 

• Explain the Sanger method of DNA sequencing 

• Discuss the similarities and differences between eukaryotic and 
prokaryotic DNA 


The building blocks of DNA are nucleotides. The important components of 
the nucleotide are a nitrogenous base, deoxyribose (5-carbon sugar), and a 
phosphate group ([link]). The nucleotide is named depending on the 
nitrogenous base. The nitrogenous base can be a purine such as adenine (A) 
and guanine (G), or a pyrimidine such as cytosine (C) and thymine (T). 


Each nucleotide is made up of a sugar, a phosphate 
group, and a nitrogenous base. The sugar is deoxyribose 
in DNA and ribose in RNA. 


The nucleotides combine with each other by covalent bonds known as 
phosphodiester bonds or linkages. The purines have a double ring structure 
with a six-membered ring fused to a five-membered ring. Pyrimidines are 
smaller in size; they have a single six-membered ring structure. The carbon 
atoms of the five-carbon sugar are numbered 1', 2', 3', 4', and 5' (1' is read as 
“one prime”). The phosphate residue is attached to the hydroxyl group of 
the 5' carbon of one sugar of one nucleotide and the hydroxyl group of the 


3' carbon of the sugar of the next nucleotide, thereby forming a 5'-3' 
phosphodiester bond. 


In the 1950s, Francis Crick and James Watson worked together to determine 
the structure of DNA at the University of Cambridge, England. Other 
scientists like Linus Pauling and Maurice Wilkins were also actively 
exploring this field. Pauling had discovered the secondary structure of 
proteins using X-ray crystallography. In Wilkins’ lab, researcher Rosalind 
Franklin was using X-ray diffraction methods to understand the structure of 
DNA. Watson and Crick were able to piece together the puzzle of the DNA 
molecule on the basis of Franklin's data because Crick had also studied
X-ray diffraction ([link]). In 1962, James Watson, Francis Crick, and Maurice
Wilkins were awarded the Nobel Prize in Physiology or Medicine. Unfortunately, by then
Franklin had died, and Nobel prizes are not awarded posthumously. 



The work of pioneering scientists (a) James Watson, 
Francis Crick, and Maclyn McCarty led to our present 
day understanding of DNA. Scientist Rosalind Franklin 
discovered (b) the X-ray diffraction pattern of DNA, 
which helped to elucidate its double helix structure. 
(credit a: modification of work by Marjorie McCarty, 
Public Library of Science) 


Watson and Crick proposed that DNA is made up of two strands that are 
twisted around each other to form a right-handed helix. Base pairing takes 
place between a purine and pyrimidine; namely, A pairs with T and G pairs 
with C. Adenine and thymine are complementary base pairs, and cytosine 
and guanine are also complementary base pairs. The base pairs are 
stabilized by hydrogen bonds; adenine and thymine form two hydrogen 
bonds and cytosine and guanine form three hydrogen bonds. The two 
strands are anti-parallel in nature; that is, the 3' end of one strand faces the 
5' end of the other strand. The sugar and phosphate of the nucleotides form 
the backbone of the structure, whereas the nitrogenous bases are stacked 
inside. Each base pair is separated from the other base pair by a distance of 
0.34 nm, and each turn of the helix measures 3.4 nm. Therefore, ten base 
pairs are present per turn of the helix. The diameter of the DNA double 
helix is 2 nm, and it is uniform throughout. Only the pairing between a 
purine and pyrimidine can explain the uniform diameter. The twisting of the 
two strands around each other results in the formation of uniformly spaced
major and minor grooves ([link]).
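
The dimensions quoted above can be checked with a little arithmetic. The following Python sketch is illustrative only (the function names are made up for this example): it derives the number of base pairs per turn from the rise per base pair and the length of one turn, estimates the length of a straight duplex, and counts hydrogen bonds from the pairing rules.

```python
# Helix dimensions quoted in the text (nanometres).
RISE_PER_BP_NM = 0.34   # distance between adjacent base pairs
PITCH_NM = 3.4          # length of one full turn of the helix

# Hydrogen bonds per base pair: A-T pairs form two, G-C pairs form three.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def base_pairs_per_turn(pitch_nm=PITCH_NM, rise_nm=RISE_PER_BP_NM):
    """Number of base pairs in one helical turn."""
    return round(pitch_nm / rise_nm)


def duplex_length_nm(n_base_pairs, rise_nm=RISE_PER_BP_NM):
    """Approximate end-to-end length of a straight duplex."""
    return n_base_pairs * rise_nm


def hydrogen_bonds(strand):
    """Total hydrogen bonds in a duplex, given the bases of one strand."""
    return sum(H_BONDS[base] for base in strand.upper())


print(base_pairs_per_turn())         # 10
print(duplex_length_nm(1000))        # about 340 nm for 1000 base pairs
print(hydrogen_bonds("AATGCTAC"))    # 19
```

A G-C-rich duplex has more hydrogen bonds than an A-T-rich duplex of the same length, which the last function makes easy to compare.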

DNA has (a) a double helix structure and (b)
phosphodiester bonds. The (c) major and minor 
grooves are binding sites for DNA binding 
proteins during processes such as transcription (the 
copying of RNA from DNA) and replication. 


DNA Sequencing Techniques 


Until the 1990s, the sequencing of DNA (reading the sequence of DNA)
was a relatively expensive and long process. Using radiolabeled nucleotides
also compounded the problem through safety concerns. With currently
available technology and automated machines, the process is cheaper, safer,
and can be completed in a matter of hours. Fred Sanger developed the
sequencing method used for the human genome sequencing project, a method
that is still widely used today ([link]).


Note: 
Link to Learning 
Visit this site to watch a video explaining the DNA sequence reading
technique that resulted from Sanger’s work. 


The method is known as the dideoxy chain termination method. The 
sequencing method is based on the use of chain terminators, the 
dideoxynucleotides (ddNTPs). The dideoxynucleotides, or ddNTPs, differ
from the deoxynucleotides by the lack of a free 3' OH group on the
five-carbon sugar. If a ddNTP is added to a growing DNA strand, the chain is
not extended any further because the free 3' OH group needed to add 
another nucleotide is not available. By using a predetermined ratio of 
deoxyribonucleotides to dideoxynucleotides, it is possible to generate DNA 
fragments of different sizes. 
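
A simple probability sketch shows why a fixed ratio produces a spread of fragment sizes. It assumes, as a simplification that is not part of the method description above, that a ddNTP is incorporated at every position with the same probability p, set by the ddNTP-to-dNTP ratio. Under that assumption, fragment lengths follow a geometric distribution. The ratio used in the example is hypothetical.

```python
def termination_probability(length, p_terminate):
    """Probability that a chain stops exactly at position `length`.

    Assumes (simplification) that every position terminates
    independently with the same probability p_terminate.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return (1 - p_terminate) ** (length - 1) * p_terminate


def fraction_longer_than(length, p_terminate):
    """Fraction of chains still growing after `length` positions."""
    return (1 - p_terminate) ** length


# Hypothetical ratio: 1 ddNTP for every 100 nucleotides incorporated.
p = 0.01
for n in (1, 100, 500):
    print(n, termination_probability(n, p), fraction_longer_than(n, p))
```

Raising the proportion of ddNTPs shifts the population toward shorter fragments. Lowering it lets more chains reach long lengths but leaves fewer copies of each short fragment.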



In Frederick Sanger's dideoxy chain 
termination method, dye-labeled 
dideoxynucleotides are used to generate 
DNA fragments that terminate at different 
points. The DNA is separated by capillary 
electrophoresis on the basis of size, and 
from the order of fragments formed, the 
DNA sequence can be read. The DNA 
sequence readout is shown on an 
electropherogram that is generated by a laser 
scanner. 


The DNA sample to be sequenced is denatured or separated into two 
strands by heating it to high temperatures. The DNA is divided into four 
tubes in which a primer, DNA polymerase, and all four nucleotides (A, T, 
G, and C) are added. In addition, a limited quantity of one of the four
dideoxynucleotides is added to each tube, a different one per tube.
The tubes are labeled as A, T, G, and C according to the ddNTP added. For 
detection purposes, each of the four dideoxynucleotides carries a different 
fluorescent label. Chain elongation continues until a fluorescent dideoxy 
nucleotide is incorporated, after which no further elongation takes place. 
After the reaction is over, electrophoresis is performed. Even a difference in 
length of a single base can be detected. The sequence is read from a laser 
scanner. For his work on DNA sequencing, Sanger received a Nobel Prize 
in chemistry in 1980. 
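
The logic of reading a sequence from terminated fragments can be sketched in Python. This is a toy model, not laboratory software: it ignores the primer, assumes that a fragment ending at every position is present, and uses illustrative function names.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template_3_to_5):
    """Return (length, terminal_base) for every possible terminated chain.

    The new strand grows 5'->3' and is complementary to the template,
    which is given here in the 3'->5' direction. A ddNTP can end the
    chain at any position, so one fragment exists per template base.
    """
    new_strand = [COMPLEMENT[base] for base in template_3_to_5.upper()]
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_sequence(fragments):
    """Order fragments from shortest to longest, as electrophoresis does,
    and read off the labeled base at the end of each one."""
    return "".join(base for _, base in sorted(fragments))


template = "TTACGATG"                   # written 3'->5'
fragments = sanger_fragments(template)
print(read_sequence(fragments))         # AATGCTAC (new strand, 5'->3')
```

Because each fragment differs in length from the next by a single nucleotide, sorting by size is enough to recover the order of the bases.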


Note: 
Link to Learning 
Sanger’s genome sequencing has led to a race to sequence human genomes
at a rapid speed and low cost, often referred to as the $1000 in one day 
sequence. Learn more by selecting the Sequencing at Speed animation 
here. 


Gel electrophoresis is a technique used to separate DNA fragments of 
different sizes. Usually the gel is made of a chemical called agarose. 
Agarose powder is added to a buffer and heated. After cooling, the gel 
solution is poured into a casting tray. Once the gel has solidified, the DNA 
is loaded on the gel and electric current is applied. The DNA has a net 
negative charge and moves from the negative electrode toward the positive 
electrode. The electric current is applied for sufficient time to let the DNA 
separate according to size; the smallest fragments will be farthest from the 
well (where the DNA was loaded), and the largest, highest-molecular-weight
fragments will be closest to the well. Once the DNA is separated, the gel is
stained with a DNA-specific dye so that it can be viewed ([link]).


DNA can be separated on the basis
of size using gel electrophoresis.
(credit: James Jacob, Tompkins
Cortland Community College)


Note: 

Evolution Connection 

Neanderthal Genome: How Are We Related? 

The first draft sequence of the Neanderthal genome was published
by Richard E. Green et al. in 2010.[footnote] Neanderthals are the closest
extinct relatives of present-day humans. They are known to have lived in
Europe and Western Asia before they disappeared from the fossil record
approximately 30,000 years ago. Green’s team studied almost 40,000-year-old
fossil remains that were selected from sites across the world. Extremely
sophisticated means of sample preparation and DNA sequencing were
employed because of the fragile nature of the bones and heavy microbial 
contamination. In their study, the scientists were able to sequence some 
four billion base pairs. The Neanderthal sequence was compared with that 
of present-day humans from across the world. After comparing the 
sequences, the researchers found that the Neanderthal genome had 2 to 3 
percent greater similarity to people living outside Africa than to people in 
Africa. While current theories have suggested that all present-day humans 
can be traced to a small ancestral population in Africa, the data from the 
Neanderthal genome may contradict this view. Green and his colleagues 
also discovered DNA segments among people in Europe and Asia that are 
more similar to Neanderthal sequences than to other contemporary human 
sequences. Another interesting observation was that Neanderthals are as 
closely related to people from Papua New Guinea as to those from China 
or France. This is surprising because Neanderthal fossil remains have been 
located only in Europe and West Asia. Most likely, genetic exchange took 
place between Neanderthals and modern humans as modern humans 
emerged out of Africa, before the divergence of Europeans, East Asians, 
and Papua New Guineans. 

Richard E. Green et al., “A Draft Sequence of the Neandertal Genome,” 
Science 328 (2010): 710-22. 

Several genes seem to have undergone changes from Neanderthals during 
the evolution of present-day humans. These genes are involved in cranial 
structure, metabolism, skin morphology, and cognitive development. One 
of the genes that is of particular interest is RUNX2, which differs between
modern-day humans and Neanderthals. This gene is responsible for the
prominent frontal bone, bell-shaped rib cage, and dental differences seen in 
Neanderthals. It is speculated that an evolutionary change in RUNX2 was 
important in the origin of modern-day humans, and this affected the 
cranium and the upper body. 


Note: 
Link to Learning 
Watch Svante Pääbo’s talk explaining the Neanderthal genome research at
the 2011 annual TED (Technology, Entertainment, Design) conference. 


DNA Packaging in Cells 


When comparing prokaryotic cells to eukaryotic cells, prokaryotes are 
much simpler than eukaryotes in many of their features ([link]). Most 
prokaryotes contain a single, circular chromosome that is found in an area 
of the cytoplasm called the nucleoid. 


Note: 
Art Connection 



A eukaryote contains a well-defined nucleus, 
whereas in prokaryotes, the chromosome lies in 
the cytoplasm in an area called the nucleoid. 


In eukaryotic cells, DNA and RNA synthesis occur in a separate 
compartment from protein synthesis. In prokaryotic cells, both processes 
occur together. What advantages might there be to separating the 
processes? What advantages might there be to having them occur together? 


The size of the genome in one of the most well-studied prokaryotes, E. coli,
is 4.6 million base pairs (approximately 1.6 mm at 0.34 nm per base pair, if cut and stretched out).
So how does this fit inside a small bacterial cell? The DNA is twisted by 
what is known as supercoiling. Supercoiling means that DNA is either 
under-wound (less than one turn of the helix per 10 base pairs) or
over-wound (more than one turn per 10 base pairs) relative to its normal relaxed state.
Some proteins are known to be involved in the supercoiling; other proteins 
and enzymes such as DNA gyrase help in maintaining the supercoiled 
structure. 


Eukaryotes, whose chromosomes each consist of a linear DNA molecule, 
employ a different type of packing strategy to fit their DNA inside the 
nucleus ([link]). At the most basic level, DNA is wrapped around proteins
known as histones to form structures called nucleosomes. The histones are 
evolutionarily conserved proteins that are rich in basic amino acids and 
form an octamer. The DNA (which is negatively charged because of the 
phosphate groups) is wrapped tightly around the histone core. This 
nucleosome is linked to the next one by a short stretch of linker DNA. This is
also known as the “beads on a string” structure. It is further compacted
into a fiber 30 nm in diameter. At the metaphase
stage, the chromosomes are at their most compact, are approximately 700 
nm in width, and are found in association with scaffold proteins. 


In interphase, eukaryotic chromosomes have two distinct regions that can 
be distinguished by staining. The tightly packaged region is known as 
heterochromatin, and the less dense region is known as euchromatin. 
Heterochromatin usually contains genes that are not expressed, and is found 
in the regions of the centromere and telomeres. The euchromatin usually 
contains genes that are transcribed, with DNA packaged around 
nucleosomes but not further compacted. 


Organization of Eukaryotic Chromosomes 


Figure panels, in order of increasing compaction: DNA double helix; DNA
wrapped around histone; nucleosomes coiled into a chromatin fiber; further
condensation of chromatin; duplicated chromosome.

These figures illustrate the 
compaction of the eukaryotic 
chromosome. 


Section Summary 


The currently accepted model of the double-helix structure of DNA was 
proposed by Watson and Crick. Some of the salient features are that the two 
strands that make up the double helix are complementary and anti-parallel 
in nature. Deoxyribose sugars and phosphates form the backbone of the 
structure, and the nitrogenous bases are stacked inside. The diameter of the 
double helix, 2 nm, is uniform throughout. A purine always pairs with a 
pyrimidine; A pairs with T, and G pairs with C. One turn of the helix has 
ten base pairs. During cell division, each daughter cell receives a copy of
the DNA by a process known as DNA replication. Prokaryotes are much
simpler than eukaryotes in many of their features. Most prokaryotes contain 
a single, circular chromosome. In general, eukaryotic chromosomes contain 
a linear DNA molecule packaged into nucleosomes, and have two distinct 
regions that can be distinguished by staining, reflecting different states of 
packaging and compaction. 


Art Connections 


Exercise: 


Problem: 


[link] In eukaryotic cells, DNA and RNA synthesis occur in a separate 
compartment from protein synthesis. In prokaryotic cells, both 
processes occur together. What advantages might there be to 
separating the processes? What advantages might there be to having 
them occur together? 


Solution: 


[link] Compartmentalization enables a eukaryotic cell to divide 
processes into discrete steps so it can build more complex protein and 
RNA products. But there is an advantage to having a single 
compartment as well: RNA and protein synthesis occurs much more 
quickly in a prokaryotic cell. 


Review Questions 


Exercise: 


Problem: The DNA double helix does not have which of the following?


a. antiparallel configuration
b. complementary base pairing
c. major and minor grooves
d. uracil


Solution: 


D 


Exercise: 


Problem: In eukaryotes, what is the DNA wrapped around?


a. single-stranded binding proteins
b. sliding clamp
c. polymerase
d. histones


Solution: 


D 


Free Response 


Exercise: 


Problem: Provide a brief summary of the Sanger sequencing method. 


Solution: 


The template DNA strand is mixed with a DNA polymerase, a primer, 
the 4 deoxynucleotides, and a limiting concentration of 4 
dideoxynucleotides. DNA polymerase synthesizes a strand 
complementary to the template. Incorporation of ddNTPs at different 
locations results in DNA fragments that have terminated at every 
possible base in the template. These fragments are separated by gel 
electrophoresis and visualized by a laser detector to determine the 
sequence of bases. 


Exercise: 


Problem:


Describe the structure and complementary base pairing of DNA.


Solution:

DNA has two strands in anti-parallel orientation. The sugar-phosphate 
linkages form a backbone on the outside, and the bases are paired on 
the inside: A with T, and G with C, like rungs on a spiral ladder. 


Glossary 


electrophoresis 
technique used to separate DNA fragments according to size 


Basics of DNA Replication 
By the end of this section, you will be able to: 


• Explain how the structure of DNA reveals the replication process
• Describe the Meselson and Stahl experiments


The elucidation of the structure of the double helix provided a hint as to 
how DNA divides and makes copies of itself. This model suggests that the 
two strands of the double helix separate during replication, and each strand 
serves as a template from which the new complementary strand is copied. 
What was not clear was how the replication took place. There were three 
models suggested ([link]): conservative, semi-conservative, and dispersive.


Suggested Models of DNA Replication 


Conservative Semi-conservative Dispersive 


The three suggested models 
of DNA replication. Grey 
indicates the original DNA 
strands, and blue indicates 
newly synthesized DNA. 


In conservative replication, the parental DNA remains together, and the
newly formed daughter strands are together. The semi-conservative method
suggests that each of the two parental DNA strands acts as a template for
new DNA to be synthesized; after replication, each double-stranded DNA
includes one parental or “old” strand and one “new” strand. In the
dispersive model, both copies of DNA have double-stranded segments of
parental DNA and newly synthesized DNA interspersed.


Meselson and Stahl were interested in understanding how DNA replicates. 
They grew E. coli for several generations in a medium containing a “heavy” 
isotope of nitrogen (¹⁵N) that gets incorporated into nitrogenous bases, and
eventually into the DNA ([link]).


Meselson and Stahl experimented
with E. coli grown first in heavy
nitrogen (¹⁵N) then in ¹⁴N. DNA
grown in ¹⁵N (red band) is heavier
than DNA grown in ¹⁴N (orange
band), and sediments to a lower level
in cesium chloride solution in an
ultracentrifuge. When DNA grown in
¹⁵N is switched to media containing
¹⁴N, after one round of cell division
the DNA sediments halfway between
the ¹⁵N and ¹⁴N levels, indicating that
it now contains fifty percent ¹⁴N. In
subsequent cell divisions, an
increasing amount of DNA contains
¹⁴N only. This data supports the
semi-conservative replication model.
(credit: modification of work by
Mariana Ruiz Villareal)


The E. coli culture was then shifted into medium containing ¹⁴N and
allowed to grow for one generation. The cells were harvested and the DNA
was isolated. The DNA was centrifuged at high speeds in an ultracentrifuge.
Some cells were allowed to grow for one more generation in ¹⁴N and spun
again. During density gradient centrifugation, the DNA is loaded into a
gradient (typically of a salt such as cesium chloride, or of sucrose) and spun at
high speeds of 50,000 to 60,000 rpm. Under these circumstances, the DNA
will form a band according to its density in the gradient. DNA grown in ¹⁵N
will band at a higher density position than that grown in ¹⁴N. Meselson and
Stahl noted that after one generation of growth in ¹⁴N after the cells had been
shifted from ¹⁵N, the single band observed was intermediate in position
between DNA of cells grown exclusively in ¹⁵N and ¹⁴N. This suggested
either a semi-conservative or dispersive mode of replication. The DNA
harvested from cells grown for two generations in ¹⁴N formed two bands:
one DNA band was at the intermediate position between ¹⁵N and ¹⁴N, and
the other corresponded to the band of ¹⁴N DNA. These results could only be
explained if DNA replicates in a semi-conservative manner. Therefore, the
other two modes were ruled out.
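
The reasoning behind this conclusion can be made concrete by working out which bands each model predicts. The Python sketch below is a toy model under stated assumptions: each strand is described only by its ¹⁵N fraction, a band position is the average ¹⁵N fraction of a duplex, and the dispersive model is assumed to mix old and new material evenly. The function names are illustrative.

```python
from collections import Counter
from fractions import Fraction

HEAVY, LIGHT = Fraction(1), Fraction(0)   # 15N fraction of a single strand


def semi_conservative(duplexes):
    """Each parental strand is kept and paired with a new, light strand."""
    return [(strand, LIGHT) for duplex in duplexes for strand in duplex]


def conservative(duplexes):
    """The parental duplex stays intact; the copy is entirely new."""
    out = []
    for duplex in duplexes:
        out.append(duplex)
        out.append((LIGHT, LIGHT))
    return out


def dispersive(duplexes):
    """Old and new material are mixed evenly into both daughter duplexes."""
    out = []
    for a, b in duplexes:
        mixed = (a + b) / 4   # each daughter keeps half of the parental 15N
        out.extend([(mixed, mixed), (mixed, mixed)])
    return out


def bands(duplexes):
    """Map each band position (15N fraction) to its share of the DNA."""
    counts = Counter((a + b) / 2 for a, b in duplexes)
    total = sum(counts.values())
    return {pos: Fraction(n, total) for pos, n in sorted(counts.items())}


def run(model, generations):
    """Start from fully heavy DNA and grow in 14N for some generations."""
    duplexes = [(HEAVY, HEAVY)]
    for _ in range(generations):
        duplexes = model(duplexes)
    return bands(duplexes)


for model in (semi_conservative, conservative, dispersive):
    for generation in (1, 2):
        print(model.__name__, generation, run(model, generation))
```

In this model, only semi-conservative replication gives a single intermediate band after one generation and two bands (intermediate and light) after two. The conservative model never produces an intermediate band, and the dispersive model always produces a single band that drifts toward the light position.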


During DNA replication, each of the two strands that make up the double 
helix serves as a template from which new strands are copied. The new 
strand will be complementary to the parental or “old” strand. When two 
daughter DNA copies are formed, they have the same sequence and are 
divided equally into the two daughter cells. 


Note: 


Link to Learning 
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Click through this tutorial on DNA replication. 


Section Summary 


The model for DNA replication suggests that the two strands of the double 
helix separate during replication, and each strand serves as a template from 
which the new complementary strand is copied. In conservative replication, 
the parental DNA is conserved, and the daughter DNA is newly 
synthesized. The semi-conservative method suggests that each of the two 
parental DNA strands acts as a template for new DNA to be synthesized;
after replication, each double-stranded DNA includes one parental or “old”
strand and one “new” strand. The dispersive model suggests that the two
copies of the DNA would have interspersed segments of parental DNA and newly
synthesized DNA.


Review Questions 


Exercise: 


Problem: 


Meselson and Stahl's experiments proved that DNA replicates by 
which mode? 


a. conservative 

b. semi-conservative 
c. dispersive 

d. none of the above 


Solution: 


B


Exercise:


Problem: 


If the sequence of the 5'-3' strand is AATGCTAC, then the 
complementary sequence has the following sequence: 


a. 3'-AATGCTAC-5' 
b. 3'-CATCGTAA-5'
c. 3'-TTACGATG-5'
d. 3'-GTAGCATT-5' 


Solution: 


C
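
Complementary strands can also be worked out mechanically from the pairing rules (A with T, G with C). The following is a minimal Python sketch with illustrative function names. It returns the partner strand both in the 3'-5' orientation, aligned base for base with the input, and in the conventional 5'-3' orientation.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3):
    """Partner strand written 3'->5', so that each base lines up with
    the base it pairs with on the input strand."""
    return "".join(PAIRS[base] for base in strand_5_to_3.upper())


def reverse_complement(strand_5_to_3):
    """Partner strand written in the conventional 5'->3' direction."""
    return complementary_strand(strand_5_to_3)[::-1]


top = "AATGCTAC"
print("5'-" + top + "-3'")
print("3'-" + complementary_strand(top) + "-5'")   # 3'-TTACGATG-5'
print("5'-" + reverse_complement(top) + "-3'")     # 5'-GTAGCATT-3'
```

The last two lines describe the same molecule. Only the direction in which the strand is written differs, which is why the 5' and 3' labels matter when comparing sequences.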


Free Response 


Exercise: 


Problem: 


How did the scientific community learn that DNA replication takes 
place in a semi-conservative fashion? 


Solution: 


Meselson and Stahl’s experiments with E. coli grown first in ¹⁵N and then
in ¹⁴N established this finding.


DNA Replication in Prokaryotes 
By the end of this section, you will be able to: 


• Explain the process of DNA replication in prokaryotes
• Discuss the role of different enzymes and proteins in supporting this
  process


DNA replication has been extremely well studied in prokaryotes primarily 
because of the small size of the genome and the mutants that are available. 
E. coli has 4.6 million base pairs in a single circular chromosome and all of 
it gets replicated in approximately 42 minutes, starting from a single origin 
of replication and proceeding around the circle in both directions. This 
means that approximately 1000 nucleotides are added per second. The 
process is quite rapid and occurs without many mistakes. 
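
The rate quoted above follows from the genome size and replication time once the two replication forks are taken into account. The short Python sketch below makes the arithmetic explicit. It assumes that both forks move at the same constant speed, and the function names are illustrative.

```python
def per_fork_rate(genome_bp, minutes, forks=2):
    """Average nucleotides per second copied at each replication fork."""
    return genome_bp / (forks * minutes * 60)


def replication_time_minutes(genome_bp, rate_nt_per_s, forks=2):
    """Time to copy the whole chromosome at a given per-fork rate."""
    return genome_bp / (forks * rate_nt_per_s) / 60


# E. coli figures from the text: 4.6 million base pairs in about 42 minutes.
print(round(per_fork_rate(4.6e6, 42)))                   # 913
print(round(replication_time_minutes(4.6e6, 1000), 1))   # 38.3
```

About 913 nucleotides per second at each fork is consistent with the rounded figure of 1000. Conversely, a rate of exactly 1000 nucleotides per second at each fork would complete the chromosome in a little over 38 minutes.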


DNA replication employs a large number of proteins and enzymes, each of 
which plays a critical role during the process. One of the key players is the 
enzyme DNA polymerase, also known as DNA pol, which adds nucleotides 
one by one to the growing DNA chain that are complementary to the 
template strand. The addition of nucleotides requires energy; this energy is 
obtained from the nucleotides that have three phosphates attached to them, 
similar to ATP which has three phosphate groups attached. When the bond 
between the phosphates is broken, the energy released is used to form the 
phosphodiester bond between the incoming nucleotide and the growing 
chain. In prokaryotes, three main types of polymerases are known: DNA 
pol I, DNA pol II, and DNA pol III. It is now known that DNA pol III is the 
enzyme required for DNA synthesis; DNA pol I and DNA pol II are 
primarily required for repair. 


How does the replication machinery know where to begin? It turns out that 
there are specific nucleotide sequences called origins of replication where 
replication begins. In E. coli, which has a single origin of replication on its 
one chromosome (as do most prokaryotes), it is approximately 245 base 
pairs long and is rich in AT sequences. The origin of replication is 
recognized by certain proteins that bind to this site. An enzyme called 
helicase unwinds the DNA by breaking the hydrogen bonds between the 
nitrogenous base pairs. ATP hydrolysis is required for this process. As the 
DNA opens up, Y-shaped structures called replication forks are formed. 


Two replication forks are formed at the origin of replication and these get 
extended bidirectionally as replication proceeds. Single-strand binding
proteins coat the single strands of DNA near the replication fork to prevent 
the single-stranded DNA from winding back into a double helix. DNA 
polymerase is able to add nucleotides only in the 5' to 3' direction (a new 
DNA strand can be only extended in this direction). It also requires a free 
3'-OH group to which it can add nucleotides by forming a phosphodiester 
bond between the 3'-OH end and the 5' phosphate of the next nucleotide. 
This essentially means that it cannot add nucleotides if a free 3'-OH group 
is not available. Then how does it add the first nucleotide? The problem is 
solved with the help of a primer that provides the free 3'-OH end. Another 
enzyme, RNA primase, synthesizes an RNA primer that is about five to ten 
nucleotides long and complementary to the DNA. Because this sequence 
primes the DNA synthesis, it is appropriately called the primer. DNA 
polymerase can now extend this RNA primer, adding nucleotides one by 
one that are complementary to the template strand ([link]).


Note: 
Art Connection 

A replication fork is formed when helicase
separates the DNA strands at the origin of
replication. The DNA tends to become more
highly coiled ahead of the replication fork.
Topoisomerase breaks and reforms DNA’s
phosphate backbone ahead of the replication
fork, thereby relieving the pressure that
results from this supercoiling. Single-strand
binding proteins bind to the single-stranded
DNA to prevent the helix from re-forming.
Primase synthesizes an RNA primer. DNA
polymerase III uses this primer to synthesize
the daughter DNA strand. On the leading
strand, DNA is synthesized continuously,
whereas on the lagging strand, DNA is
synthesized in short stretches called Okazaki
fragments. DNA polymerase I replaces the
RNA primer with DNA. DNA ligase seals
the gaps between the Okazaki fragments,
joining the fragments into a single DNA
molecule. (credit: modification of work by
Mariana Ruiz Villareal)


You isolate a cell strain in which the joining together of Okazaki fragments 
is impaired and suspect that a mutation has occurred in an enzyme found at 
the replication fork. Which enzyme is most likely to be mutated? 


The replication fork moves at the rate of 1000 nucleotides per second. DNA 
polymerase can only extend in the 5' to 3' direction, which poses a slight 
problem at the replication fork. As we know, the DNA double helix is
anti-parallel; that is, one strand is in the 5' to 3' direction and the other is
oriented in the 3' to 5' direction. One strand, which is complementary to the 
3' to 5' parental DNA strand, is synthesized continuously towards the 
replication fork because the polymerase can add nucleotides in this 
direction. This continuously synthesized strand is known as the leading 
strand. The other strand, complementary to the 5' to 3' parental DNA, is 
extended away from the replication fork, in small fragments known as 
Okazaki fragments, each requiring a primer to start the synthesis. Okazaki 
fragments are named after the Japanese scientist who first discovered them. 
The strand with the Okazaki fragments is known as the lagging strand. 


The leading strand can be extended by one primer alone, whereas the 
lagging strand needs a new primer for each of the short Okazaki fragments. 
The overall direction of the lagging strand will be 3' to 5', and that of the
leading strand 5' to 3'. A protein called the sliding clamp holds the DNA 
polymerase in place as it continues to add nucleotides. The sliding clamp is 
a ring-shaped protein that binds to the DNA and holds the polymerase in 
place. Topoisomerase prevents the over-winding of the DNA double helix 
ahead of the replication fork as the DNA is opening up; it does so by 
causing temporary nicks in the DNA helix and then resealing it. As 
synthesis proceeds, the RNA primers are replaced by DNA. The primers are 
removed by the exonuclease activity of DNA pol I, and the gaps are filled 
in by deoxyribonucleotides. The nicks that remain between the newly 
synthesized DNA (that replaced the RNA primer) and the previously 
synthesized DNA are sealed by the enzyme DNA ligase that catalyzes the 
formation of phosphodiester linkage between the 3'-OH end of one 
nucleotide and the 5' phosphate end of the other fragment. 
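
The bookkeeping for the lagging strand can be sketched as follows. This Python example is illustrative only: the template length and fragment length are hypothetical numbers chosen for the example, the stretch of template is treated as linear, and the function names are not taken from any library.

```python
def okazaki_fragments(lagging_template_len, fragment_len):
    """Split the lagging-strand template into Okazaki fragments.

    Returns a list of (start, end) positions, end exclusive, in the
    order the fragments are made as the fork opens.
    """
    if fragment_len <= 0:
        raise ValueError("fragment_len must be positive")
    return [
        (start, min(start + fragment_len, lagging_template_len))
        for start in range(0, lagging_template_len, fragment_len)
    ]


def lagging_strand_requirements(lagging_template_len, fragment_len):
    """Count the primers and ligase-sealed nicks needed on the lagging strand."""
    n = len(okazaki_fragments(lagging_template_len, fragment_len))
    return {
        "fragments": n,
        "primers": n,                    # one RNA primer per fragment
        "nicks_sealed": max(n - 1, 0),   # ligase joins adjacent fragments
    }


# Hypothetical numbers: 10,000 nt of template, 1,500 nt per fragment.
print(lagging_strand_requirements(10_000, 1_500))
```

By contrast, the leading strand over the same stretch needs a single primer and no joining of fragments, which is the asymmetry the text describes.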


Once the chromosome has been completely replicated, the two DNA copies 
move into two different cells during cell division. The process of DNA 
replication can be summarized as follows: 


1. DNA unwinds at the origin of replication.
2. Helicase opens up the DNA, forming replication forks; these are
extended bidirectionally.
3. Single-strand binding proteins coat the DNA around the replication
fork to prevent rewinding of the DNA.
4. Topoisomerase binds at the region ahead of the replication fork to
prevent supercoiling.
5. Primase synthesizes RNA primers complementary to the DNA strand.
6. DNA polymerase starts adding nucleotides to the 3'-OH end of the
primer.
7. Elongation of both the lagging and the leading strand continues.
8. RNA primers are removed by exonuclease activity.
9. Gaps are filled by DNA pol by adding dNTPs.
10. The gap between the two DNA fragments is sealed by DNA ligase,
which helps in the formation of phosphodiester bonds.


[link] summarizes the enzymes involved in prokaryotic DNA replication 
and the functions of each. 


Prokaryotic DNA Replication: Enzymes and Their Function

Enzyme/protein                  Specific Function
DNA pol I                       Exonuclease activity removes RNA primer and
                                replaces it with newly synthesized DNA
DNA pol II                      Repair function
DNA pol III                     Main enzyme that adds nucleotides in the 5'-3'
                                direction
Helicase                        Opens the DNA helix by breaking hydrogen
                                bonds between the nitrogenous bases
Ligase                          Seals the gaps between the Okazaki fragments
                                to create one continuous DNA strand
Primase                         Synthesizes RNA primers needed to start
                                replication
Sliding clamp                   Helps to hold the DNA polymerase in place
                                when nucleotides are being added
Topoisomerase                   Helps relieve the stress on DNA when
                                unwinding by causing breaks and then
                                resealing the DNA
Single-strand binding           Bind to single-stranded DNA to keep it from
proteins (SSB)                  rewinding


Review the full process of DNA replication here.


Section Summary 


Replication in prokaryotes starts from a sequence found on the chromosome 
called the origin of replication—the point at which the DNA opens up. 
Helicase opens up the DNA double helix, resulting in the formation of the 
replication fork. Single-strand binding proteins bind to the single-stranded 
DNA near the replication fork to keep the fork open. Primase synthesizes an 
RNA primer to initiate synthesis by DNA polymerase, which can add 
nucleotides only in the 5' to 3' direction. One strand is synthesized 
continuously in the direction of the replication fork; this is called the 
leading strand. The other strand is synthesized in a direction away from the 
replication fork, in short stretches of DNA known as Okazaki fragments. 
This strand is known as the lagging strand. Once replication is completed, 
the RNA primers are replaced by DNA nucleotides and the DNA is sealed
with DNA ligase, which creates phosphodiester bonds between the 3'-OH of
one end and the 5' phosphate of the adjacent end.


Art Connections 


Exercise: 


Problem: 


[link] You isolate a cell strain in which the joining together of Okazaki 
fragments is impaired and suspect that a mutation has occurred in an 
enzyme found at the replication fork. Which enzyme is most likely to 
be mutated? 


Solution: 


[link] DNA ligase, as this enzyme joins together Okazaki fragments. 


Review Questions 


Exercise: 


Problem: 


Which of the following components is not involved during the 
formation of the replication fork? 


a. single-strand binding proteins 
b. helicase 

c. origin of replication 

d. ligase 


Solution: 


D 


Exercise: 


Problem: Which of the following does the enzyme primase synthesize? 


a. DNA primer 

b. RNA primer 

c. Okazaki fragments 

d. phosphodiester linkage 


Solution:


B


Exercise:


Problem: In which direction does DNA replication take place?


a. 5'-3'
b. 3'-5'
c. 5'-5'
d. 3'-3'


Solution: 


A 


Free Response 


Exercise: 


Problem: 


DNA replication is bidirectional and discontinuous; explain your 
understanding of those concepts. 


Solution: 


At an origin of replication, two replication forks are formed that are 
extended in two directions. On the lagging strand, Okazaki fragments 
are formed in a discontinuous manner. 


Exercise: 


Problem: What are Okazaki fragments and how are they formed?


Solution:


Short DNA fragments are formed on the lagging strand synthesized in 
a direction away from the replication fork. These are synthesized by 
DNA pol. 


Exercise: 
Problem: 
If the rate of replication in a particular prokaryote is 900 nucleotides
per second, how long would it take a genome of 1.2 million base pairs to
make two copies?


Solution: 


1333 seconds or 22.2 minutes. 
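
The arithmetic behind this answer can be checked with a short sketch. It assumes, as the stated solution appears to, that the whole genome is copied once at a constant rate of 900 nucleotides per second; the function name is illustrative only.

```python
def replication_time_seconds(genome_bp, rate_nt_per_s):
    """Time to copy genome_bp base pairs once at a constant rate."""
    return genome_bp / rate_nt_per_s


seconds = replication_time_seconds(1_200_000, 900)
minutes = seconds / 60
print(f"{seconds:.0f} seconds, or about {minutes:.1f} minutes")
# prints "1333 seconds, or about 22.2 minutes"
```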
Exercise: 
Problem: 


Explain the events taking place at the replication fork. If the gene for 
helicase is mutated, what part of replication will be affected? 


Solution: 


At the replication fork, the events taking place are helicase action, 
binding of single-strand binding proteins, primer synthesis, and 
synthesis of new strands. If there is a mutated helicase gene, the 
replication fork will not be extended. 


Exercise: 


Problem: 


What is the role of a primer in DNA replication? What would happen 
if you forgot to add a primer in a tube containing the reaction mix for a 
DNA sequencing reaction? 


Solution: 


A primer provides a 3'-OH group for DNA pol to start adding
nucleotides. There would be no reaction in the tube without a primer, 
and no bands would be visible on the electrophoresis. 


Glossary 


helicase 
during replication, this enzyme helps to open up the DNA helix by 
breaking the hydrogen bonds 


lagging strand 
during replication, the strand that is replicated in short fragments and 
away from the replication fork 


leading strand 
strand that is synthesized continuously in the 5'-3' direction, toward
the replication fork


ligase 
enzyme that catalyzes the formation of a phosphodiester linkage 
between the 3' OH and 5' phosphate ends of the DNA 


Okazaki fragment 
DNA fragment that is synthesized in short stretches on the lagging 
strand 


primase 
enzyme that synthesizes the RNA primer; the primer is needed for 
DNA pol to start synthesis of a new DNA strand


primer 
short stretch of nucleotides that is required to initiate replication; in the 
case of replication, the primer has RNA nucleotides 


replication fork 
Y-shaped structure formed during initiation of replication 


single-strand binding protein 
during replication, protein that binds to the single-stranded DNA; this 
helps in keeping the two strands of DNA apart so that they may serve 
as templates 


sliding clamp 
ring-shaped protein that holds the DNA pol on the DNA strand 


topoisomerase 
enzyme that relieves the overwinding (supercoiling) of DNA that builds
up ahead of the replication fork while DNA replication is taking place


DNA Replication in Eukaryotes 
By the end of this section, you will be able to: 


• Discuss the similarities and differences between DNA replication in
eukaryotes and prokaryotes
• State the role of telomerase in DNA replication


Eukaryotic genomes are much more complex and larger in size than 
prokaryotic genomes. The human genome has three billion base pairs per 
haploid set of chromosomes, and 6 billion base pairs are replicated during 
the S phase of the cell cycle. There are multiple origins of replication on the 
eukaryotic chromosome; humans can have up to 100,000 origins of 
replication. The rate of replication is approximately 100 nucleotides per 
second, much slower than prokaryotic replication. In yeast, which is a 
eukaryote, special sequences known as Autonomously Replicating 
Sequences (ARS) are found on the chromosomes. These are equivalent to 
the origin of replication in E. coli. 
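
To see why multiple origins matter at this slower rate, the figures above can be combined in a rough estimate. This is only an idealized sketch: it assumes that every origin fires at the same moment, that origins are evenly spaced, and that each origin launches two forks. None of these assumptions comes from the text, and real cells do not fire all origins at once, so the result is a lower bound rather than a measured S-phase length. The function name is hypothetical.

```python
def idealized_replication_time(genome_bp, rate_nt_per_s, origins):
    """Lower-bound time in seconds, assuming all origins fire at once,
    are evenly spaced, and each launches two replication forks."""
    forks = 2 * origins
    return genome_bp / (forks * rate_nt_per_s)


diploid_bp = 6_000_000_000   # base pairs replicated in S phase (from the text)
rate = 100                   # nucleotides per second (from the text)

one_origin = idealized_replication_time(diploid_bp, rate, origins=1)
many_origins = idealized_replication_time(diploid_bp, rate, origins=100_000)

print(f"1 origin: about {one_origin / 86_400:.0f} days")
print(f"100,000 origins: about {many_origins / 60:.0f} minutes")
```

Under these assumptions a single origin would need roughly a year, whereas 100,000 origins bring the idealized figure down to minutes.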


The number of DNA polymerases in eukaryotes is much greater than in
prokaryotes: 14 are known, of which five are known to have major roles
during replication and have been well studied. They are known as pol α, pol
β, pol γ, pol δ, and pol ε.


The essential steps of replication are the same as in prokaryotes. Before 
replication can start, the DNA has to be made available as template. 
Eukaryotic DNA is bound to basic proteins known as histones to form 
structures called nucleosomes. The chromatin (the complex between DNA 
and proteins) may undergo some chemical modifications, so that the DNA 
may be able to slide off the proteins or be accessible to the enzymes of the 
DNA replication machinery. At the origin of replication, a pre-replication 
complex is made with other initiator proteins. Other proteins are then 
recruited to start the replication process ([link]).


A helicase using the energy from ATP hydrolysis opens up the DNA helix. 
Replication forks are formed at each replication origin as the DNA 
unwinds. The opening of the double helix causes over-winding, or 
supercoiling, in the DNA ahead of the replication fork. These are resolved 
with the action of topoisomerases. Primers are formed by the enzyme 


primase, and using the primer, DNA pol can start synthesis. While the 
leading strand is continuously synthesized by the enzyme pol δ, the lagging
strand is synthesized by pol ε. A sliding clamp protein known as PCNA
(Proliferating Cell Nuclear Antigen) holds the DNA pol in place so that it 
does not slide off the DNA. RNase H removes the RNA primer, which is 
then replaced with DNA nucleotides. The Okazaki fragments in the lagging 
strand are joined together after the replacement of the RNA primers with 
DNA. The gaps that remain are sealed by DNA ligase, which forms the 
phosphodiester bond. 


Telomere replication 


Unlike prokaryotic chromosomes, eukaryotic chromosomes are linear. As 
you’ve learned, the enzyme DNA pol can add nucleotides only in the 5' to 
3' direction. In the leading strand, synthesis continues until the end of the 
chromosome is reached. On the lagging strand, DNA is synthesized in short 
stretches, each of which is initiated by a separate primer. When the 
replication fork reaches the end of the linear chromosome, there is no place 
for a primer to be made for the DNA fragment to be copied at the end of the 
chromosome. These ends thus remain unpaired, and over time these ends 
may get progressively shorter as cells continue to divide. 


The ends of the linear chromosomes are known as telomeres, which have 
repetitive sequences that code for no particular gene. In a way, these 
telomeres protect the genes from getting deleted as cells continue to divide. 
In humans, a six base pair sequence, TTAGGG, is repeated 100 to 1000 
times. The discovery of the enzyme telomerase ([link]) helped in the 
understanding of how chromosome ends are maintained. The telomerase 
enzyme contains a catalytic part and a built-in RNA template. It attaches to 
the end of the chromosome, and complementary bases to the RNA template 
are added on the 3' end of the DNA strand. Once the 3' end of the lagging 
strand template is sufficiently elongated, DNA polymerase can add the 
nucleotides complementary to the ends of the chromosomes. Thus, the ends 
of the chromosomes are replicated.


Telomerase has an associated RNA that complements
the 3' overhang at the end of the chromosome.
The RNA template is used to synthesize the complementary
strand.
Telomerase shifts, and the process is repeated.
Primase and DNA polymerase synthesize the complementary
strand.


The ends of linear 
chromosomes are maintained by 
the action of the telomerase 
enzyme. 
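
The repeat-adding step can be pictured with a small string-based sketch. It is a simplification: the starting sequence is hypothetical, the number of repeats added is arbitrary, and the functions only manipulate text rather than modeling the enzyme. The TTAGGG repeat is the human sequence given above.

```python
TELOMERE_REPEAT = "TTAGGG"  # human telomeric repeat given in the text


def extend_three_prime_end(strand, repeats):
    """Append telomeric repeats to the 3' end of a strand written 5' to 3'."""
    return strand + TELOMERE_REPEAT * repeats


def complement_strand(strand):
    """Return the complementary strand, also written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(strand))


overhang = "ACGT" + TELOMERE_REPEAT            # hypothetical chromosome end
extended = extend_three_prime_end(overhang, repeats=2)
print(extended)                                # ACGTTTAGGGTTAGGGTTAGGG
print(complement_strand(TELOMERE_REPEAT))      # CCCTAA
```

The second line of output shows the sequence that DNA polymerase would lay down opposite each added repeat once a primer is in place.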


Telomerase is typically active in germ cells and adult stem cells. It is not 
active in adult somatic cells. For her discovery of telomerase and its action,
Elizabeth Blackburn ([link]) received the Nobel Prize in Physiology or
Medicine in 2009.


Elizabeth Blackburn, 2009 Nobel 
Laureate, is the scientist who 
discovered how telomerase works. 
(credit: US Embassy Sweden) 


Telomerase and Aging 


Cells that undergo cell division continue to have their telomeres shortened 
because most somatic cells do not make telomerase. This essentially means 
that telomere shortening is associated with aging. With the advent of 
modern medicine, preventative health care, and healthier lifestyles, the 
human life span has increased, and there is an increasing demand for people 
to look younger and have a better quality of life as they grow older. 


In 2010, scientists found that telomerase can reverse some age-related 
conditions in mice. This may have potential in regenerative medicine. 
[footnote] Telomerase-deficient mice were used in these studies; these mice
have tissue atrophy, stem cell depletion, organ system failure, and impaired 
tissue injury responses. Telomerase reactivation in these mice caused 
extension of telomeres, reduced DNA damage, reversed neurodegeneration, 
and improved the function of the testes, spleen, and intestines. Thus, 


telomerase reactivation may have potential for treating age-related diseases in
humans. 

Jaskelioff et al., “Telomerase reactivation reverses tissue degeneration in 
aged telomerase-deficient mice,” Nature 469 (2011): 102-7. 


Cancer is characterized by uncontrolled cell division of abnormal cells. The 
cells accumulate mutations, proliferate uncontrollably, and can migrate to 
different parts of the body through a process called metastasis. Scientists 
have observed that cancerous cells have considerably shortened telomeres 
and that telomerase is active in these cells. Interestingly, only after the 
telomeres were shortened in the cancer cells did the telomerase become 
active. If the action of telomerase in these cells can be inhibited by drugs 
during cancer therapy, then the cancerous cells could potentially be stopped 
from further division. 


Difference between Prokaryotic and Eukaryotic Replication 


Property                 Prokaryotes           Eukaryotes

Origin of replication    Single                Multiple

Rate of replication      1000 nucleotides/s    50 to 100 nucleotides/s

DNA polymerase types     5                     14

Telomerase               Not present           Present

RNA primer removal       DNA pol I             RNase H

Strand elongation        DNA pol III           Pol δ, pol ε

Sliding clamp            Sliding clamp         PCNA


Section Summary 


Replication in eukaryotes starts at multiple origins of replication. The 
mechanism is quite similar to that in prokaryotes. A primer is required to initiate
synthesis, which is then extended by DNA polymerase as it adds 
nucleotides one by one to the growing chain. The leading strand is 
synthesized continuously, whereas the lagging strand is synthesized in short 
stretches called Okazaki fragments. The RNA primers are replaced with 
DNA nucleotides; the DNA remains one continuous strand by linking the 
DNA fragments with DNA ligase. The ends of the chromosomes pose a 
problem as polymerase is unable to extend them without a primer. 
Telomerase, an enzyme with an inbuilt RNA template, extends the ends by 
copying the RNA template and extending one end of the chromosome. 
DNA polymerase can then extend the DNA using the primer. In this way, 
the ends of the chromosomes are protected. 


Review Questions 


Exercise: 


Problem: The ends of the linear chromosomes are maintained by


a. helicase 

b. primase 

c. DNA pol 
d. telomerase 


Solution: 


D 


Free Response 


Exercise: 


Problem: 


How do the linear chromosomes in eukaryotes ensure that their ends are
replicated completely? 


Solution: 


Telomerase uses its inbuilt RNA template to extend the 3' end, so that a
primer can be synthesized and extended. Thus, the ends are protected.


Glossary 
telomerase 
enzyme that contains a catalytic part and an inbuilt RNA template; it 


functions to maintain telomeres at chromosome ends 


telomere 
DNA at the end of linear chromosomes 


DNA Repair 
By the end of this section, you will be able to: 


• Discuss the different types of mutations in DNA
• Explain DNA repair mechanisms


DNA replication is a highly accurate process, but mistakes can occasionally 
occur, such as a DNA polymerase inserting a wrong base. Uncorrected 
mistakes may sometimes lead to serious consequences, such as cancer. 
Repair mechanisms correct the mistakes. In rare cases, mistakes are not 
corrected, leading to mutations; in other cases, repair enzymes are 
themselves mutated or defective. 


Most of the mistakes during DNA replication are promptly corrected by 
DNA polymerase by proofreading the base that has been just added ([link]).
In proofreading, the DNA pol reads the newly added base before adding 
the next one, so a correction can be made. The polymerase checks whether 
the newly added base has paired correctly with the base in the template 
strand. If it is the right base, the next nucleotide is added. If an incorrect 
base has been added, the enzyme makes a cut at the phosphodiester bond 
and releases the wrong nucleotide. This is performed by the exonuclease 
action of DNA pol III. Once the incorrect nucleotide has been removed, a 
new one will be added again.


Proofreading by DNA polymerase 
corrects errors during replication. 
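
The add, check, and remove cycle of proofreading can be sketched as a loop. This is a toy model, not a description of the enzyme: the error rate is a made-up, greatly exaggerated value, the random seed is arbitrary, and the new strand is listed in the same order as the template rather than antiparallel to it.

```python
import random

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def replicate_with_proofreading(template, error_rate=0.1, seed=0):
    """Copy a template one base at a time, checking each new base.

    error_rate is a hypothetical misincorporation probability (0 <= p < 1).
    Returns the new strand and the number of wrong bases removed.
    """
    if not 0 <= error_rate < 1:
        raise ValueError("error_rate must be at least 0 and less than 1")
    rng = random.Random(seed)
    new_strand = []
    removed = 0
    for template_base in template:
        correct = PAIRS[template_base]
        while True:
            if rng.random() < error_rate:
                added = rng.choice([b for b in "ATGC" if b != correct])
            else:
                added = correct
            if added == correct:       # base pairs properly: keep it
                new_strand.append(added)
                break
            removed += 1               # mismatch: remove it and try again
    return "".join(new_strand), removed


strand, removed = replicate_with_proofreading("ATGCATGC")
print(strand, removed)
```

Because every mismatch is removed before the next base is added, the finished strand is always complementary to the template in this model; only the count of removed bases varies.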


Some errors are not corrected during replication, but are instead corrected 
after replication is completed; this type of repair is known as mismatch 
repair ([link]). The enzymes recognize the incorrectly added nucleotide and


excise it; this is then replaced by the correct base. If this remains 
uncorrected, it may lead to more permanent damage. How do mismatch 
repair enzymes recognize which of the two bases is the incorrect one? In E. 
coli, after replication, the nitrogenous base adenine acquires a methyl 
group; the parental DNA strand will have methyl groups, whereas the 
newly synthesized strand lacks them. Thus, DNA polymerase is able to 
remove the wrongly incorporated bases from the newly synthesized,
non-methylated strand. In eukaryotes, the mechanism is not very well
understood, but it is believed to involve recognition of unsealed nicks in the 
new strand, as well as a short-term continuing association of some of the 
replication proteins with the new daughter strand after replication has 
completed.


In mismatch repair, the
incorrectly added base is 
detected after replication. 

The mismatch repair 
proteins detect this base 
and remove it from the 
newly synthesized strand 
by nuclease action. The 
gap is now filled with the 
correctly paired base. 


In another type of repair mechanism, nucleotide excision repair, enzymes 
replace incorrect bases by making a cut on both the 3' and 5' ends of the 
incorrect base ([link]). The segment of DNA is removed and replaced with
the correctly paired nucleotides by the action of DNA pol. Once the bases 
are filled in, the remaining gap is sealed with a phosphodiester linkage 
catalyzed by DNA ligase. This repair mechanism is often employed when 
UV exposure causes the formation of pyrimidine dimers. 




Nucleotide excision 
repairs thymine 
dimers. When 
exposed to UV, 
thymines lying 
adjacent to each 
other can form 
thymine dimers. In 
normal cells, they 
are excised and 
replaced. 


A well-studied example of mistakes not being corrected is seen in people
suffering from xeroderma pigmentosum ([link]). Affected individuals have
skin that is highly sensitive to UV rays from the sun. When individuals are 
exposed to UV, pyrimidine dimers, especially those of thymine, are formed; 


people with xeroderma pigmentosum are not able to repair the damage. These
are not repaired because of a defect in the nucleotide excision repair 
enzymes, whereas in normal individuals, the thymine dimers are excised 
and the defect is corrected. The thymine dimers distort the structure of the 
DNA double helix, and this may cause problems during DNA replication. 
People with xeroderma pigmentosum may have a higher risk of contracting
skin cancer than those who don't have the condition.




Xeroderma pigmentosum is a
condition in which thymine 
dimerization from exposure to UV 
is not repaired. Exposure to 
sunlight results in skin lesions. 
(credit: James Halpern et al.) 


Errors during DNA replication are not the only reason why mutations arise 
in DNA. Mutations, variations in the nucleotide sequence of a genome, can 
also occur because of damage to DNA. Such mutations may be of two 
types: induced or spontaneous. Induced mutations are those that result 
from an exposure to chemicals, UV rays, x-rays, or some other 
environmental agent. Spontaneous mutations occur without any exposure 
to any environmental agent; they are a result of natural reactions taking 
place within the body. 


Mutations may have a wide range of effects. Some mutations are not 
expressed; these are known as silent mutations. Point mutations are those 
mutations that affect a single base pair. The most common nucleotide 
mutations are substitutions, in which one base is replaced by another. These 
can be of two types, either transitions or transversions. Transition 
substitution refers to a purine or pyrimidine being replaced by a base of the 
same kind; for example, a purine such as adenine may be replaced by the 
purine guanine. Transversion substitution refers to a purine being replaced 
by a pyrimidine, or vice versa; for example, cytosine, a pyrimidine, is 
replaced by adenine, a purine. Mutations can also be the result of the 
addition of a base, known as an insertion, or the removal of a base, also 
known as deletion. Sometimes a piece of DNA from one chromosome may 
get translocated to another chromosome or to another region of the same 
chromosome; this is known as translocation. These mutation types are
shown in [link]. 
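
The distinction between the two kinds of substitution can be expressed as a short classifier. This is a minimal sketch using the DNA bases named above; the function name is illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original, replacement):
    """Label a single-base DNA substitution as a transition or transversion."""
    bases = PURINES | PYRIMIDINES
    if original not in bases or replacement not in bases:
        raise ValueError("expected DNA bases A, C, G, or T")
    if original == replacement:
        raise ValueError("a substitution must change the base")
    # Same chemical class on both sides means a transition.
    same_kind = (original in PURINES) == (replacement in PURINES)
    return "transition" if same_kind else "transversion"


print(classify_substitution("A", "G"))  # transition (purine to purine)
print(classify_substitution("C", "A"))  # transversion (pyrimidine to purine)
```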


Note: 
Art Connection 


Point Mutations 
Silent: has no effect on the protein sequence 


Missense: results in an amino acid substitution


Nonsense: substitutes a stop codon for an amino acid


Frameshift Mutations


Insertions or deletions of nucleotides may result in a
shift in the reading frame or insertion of a stop codon.


AGCGTACCCTAC => AGCGCCCTACTT


Mutations can lead to changes in 
the protein sequence encoded by 
the DNA. 


A frameshift mutation that results in the insertion of three nucleotides is 
often less deleterious than a mutation that results in the insertion of one 
nucleotide. Why? 


Mutations in repair genes have been known to cause cancer. Many mutated 
repair genes have been implicated in certain forms of pancreatic cancer, 
colon cancer, and colorectal cancer. Mutations can affect either somatic 
cells or germ cells. If many mutations accumulate in a somatic cell, they 
may lead to problems such as the uncontrolled cell division observed in 
cancer. If a mutation takes place in germ cells, the mutation will be passed 


on to the next generation, as in the case of hemophilia and xeroderma
pigmentosum.


Section Summary 


DNA polymerase can make mistakes while adding nucleotides. It edits the 
DNA by proofreading every newly added base. Incorrect bases are removed 
and replaced by the correct base, and then a new base is added. Most 
mistakes are corrected during replication, although when this does not 
happen, the mismatch repair mechanism is employed. Mismatch repair 
enzymes recognize the wrongly incorporated base and excise it from the 
DNA, replacing it with the correct base. In yet another type of repair, 
nucleotide excision repair, the incorrect base is removed along with a few 
bases on the 5' and 3' end, and these are replaced by copying the template 
with the help of DNA polymerase. The ends of the newly synthesized 
fragment are attached to the rest of the DNA using DNA ligase, which 
creates a phosphodiester bond. 


Most mistakes are corrected, and if they are not, they may result in a 
mutation, defined as a permanent change in the DNA sequence. Mutations
can be of many types, such as substitution, deletion, insertion, and 
translocation. Mutations in repair genes may lead to serious consequences 
such as cancer. Mutations can be induced or may occur spontaneously. 


Art Connections 


Exercise: 


Problem: 
[link] A frameshift mutation that results in the insertion of three 
nucleotides is often less deleterious than a mutation that results in the 


insertion of one nucleotide. Why? 


Solution: 


[link] If three nucleotides are added, one additional amino acid will be 
incorporated into the protein chain, but the reading frame won't shift.


Review Questions 
Exercise: 
Problem: 


During proofreading, which of the following enzymes reads the DNA? 


a. primase 

b. topoisomerase 
c. DNA pol 

d. helicase 


Solution: 


C 
Exercise: 


Problem: 


The initial mechanism for repairing nucleotide errors in DNA is 


a. mismatch repair 

b. DNA polymerase proofreading 
c. nucleotide excision repair 

d. thymine dimers 


Solution: 


B 


Free Response 


Exercise: 


Problem: 


What is the consequence of mutation of a mismatch repair enzyme? 
How will this affect the function of a gene? 


Solution: 


Mutations are not repaired, as in the case of xeroderma pigmentosum.
Gene function may be affected or it may not be expressed. 


Glossary 


induced mutation 
mutation that results from exposure to chemicals or environmental 
agents 


mutation 
variation in the nucleotide sequence of a genome 


mismatch repair 
type of repair mechanism in which mismatched bases are removed 
after replication 


nucleotide excision repair 
type of DNA repair mechanism in which the wrong base, along with a 
few nucleotides upstream or downstream, are removed 


proofreading 
function of DNA pol in which it reads the newly added base before 
adding the next one 


point mutation 
mutation that affects a single base 


silent mutation 
mutation that is not expressed 


spontaneous mutation 
mutation that takes place in the cells as a result of chemical reactions 
taking place naturally without exposure to any external agent 


transition substitution 
when a purine is replaced with a purine or a pyrimidine is replaced 
with another pyrimidine 


transversion substitution 
when a purine is replaced by a pyrimidine or a pyrimidine is replaced 
by a purine 


Introduction 


Genes, which are carried on (a) chromosomes, are linearly organized
instructions for making the RNA and protein molecules that are necessary
for all of the processes of life. The (b) interleukin-2 protein and (c)
alpha-2u-globulin protein are just two examples of the array of different
molecular structures that are encoded by genes. (credit “chromosome”:
National Human Genome Research Institute; credit “interleukin-2”: Ramin
Herati/Created from PDB 1M47 and rendered with Pymol; credit
“alpha-2u-globulin”: Darren Logan/rendered with AISMIG)


Since the rediscovery of Mendel’s work in 1900, the definition of the gene 
has progressed from an abstract unit of heredity to a tangible molecular 
entity capable of replication, expression, and mutation ([link]). Genes are
composed of DNA and are linearly arranged on chromosomes. Genes 
specify the sequences of amino acids, which are the building blocks of 
proteins. In turn, proteins are responsible for orchestrating nearly every 
function of the cell. Both genes and the proteins they encode are absolutely 
essential to life as we know it. 


The Genetic Code 
By the end of this section, you will be able to: 


• Explain the “central dogma” of protein synthesis
• Describe the genetic code and how the nucleotide sequence prescribes
the amino acid and the protein sequence


The cellular process of transcription generates messenger RNA (mRNA), a 
mobile molecular copy of one or more genes with an alphabet of A, C, G, 
and uracil (U). Translation of the mRNA template converts nucleotide- 
based genetic information into a protein product. Protein sequences consist 
of 20 commonly occurring amino acids; therefore, it can be said that the 
protein alphabet consists of 20 letters ([link]). Each amino acid is defined
by a three-nucleotide sequence called the triplet codon. Different amino 
acids have different chemistries (such as acidic versus basic, or polar and 
nonpolar) and different structural constraints. Variation in amino acid 
sequence gives rise to enormous variation in protein structure and function.


Structures of the 20 amino acids found in proteins are
shown. Each amino acid is composed of an amino group
(NH₃⁺), a carboxyl group (COO⁻), and a side chain
(blue). The side chain may be nonpolar, polar, or 
charged, as well as large or small. It is the variety of 
amino acid side chains that gives rise to the incredible 
variation of protein structure and function. 


The Central Dogma: DNA Encodes RNA; RNA Encodes 
Protein 


The flow of genetic information in cells from DNA to mRNA to protein is 
described by the Central Dogma ([link]), which states that genes specify
the sequence of mRNAs, which in turn specify the sequence of proteins. 
The decoding of one molecule to another is performed by specific proteins 
and RNAs. Because the information stored in DNA is so central to cellular 
function, it makes intuitive sense that the cell would make mRNA copies of
this information for protein synthesis, while keeping the DNA itself intact 
and protected. The copying of DNA to RNA is relatively straightforward, 
with one nucleotide being added to the mRNA strand for every nucleotide 
read in the DNA strand. The translation to protein is a bit more complex 
because three mRNA nucleotides correspond to one amino acid in the 
polypeptide sequence. However, the translation to protein is still systematic 
and colinear, such that nucleotides 1 to 3 correspond to amino acid 1, 
nucleotides 4 to 6 correspond to amino acid 2, and so on. 
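
The one-to-one copying step and the three-to-one decoding step can be contrasted in a brief sketch. The template sequence is hypothetical, and it is written 3' to 5' so that the resulting mRNA reads 5' to 3'; both function names are illustrative.

```python
def transcribe(template_dna):
    """Build an mRNA (5' to 3') from a template strand written 3' to 5'.

    One RNA nucleotide is added for every DNA nucleotide read.
    """
    pairing = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in template_dna)


def amino_acid_position(nucleotide_position):
    """1-based amino acid index for a 1-based mRNA nucleotide position."""
    return (nucleotide_position - 1) // 3 + 1


template = "TCGCATGGGATG"          # hypothetical template strand, 3' to 5'
mrna = transcribe(template)
print(mrna)                         # AGCGUACCCUAC
for position in (1, 3, 4, 6, 7):
    print(position, "->", amino_acid_position(position))
```

The mRNA has exactly as many nucleotides as the template, while nucleotide positions 1 to 3 map to amino acid 1 and positions 4 to 6 map to amino acid 2.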


[Figure labels: transcription; primary RNA transcript; RNA processing; spliced RNA; translation; ribosome; polypeptide chain]


Instructions on DNA are 
transcribed onto messenger 
RNA. Ribosomes are able to 
read the genetic information 

inscribed on a strand of 
messenger RNA and use this 
information to string amino 
acids together into a protein. 


The Genetic Code Is Degenerate and Universal 


Given the different numbers of “letters” in the mRNA and protein 
“alphabets,” scientists theorized that combinations of nucleotides 
corresponded to single amino acids. Nucleotide doublets would not be 
sufficient to specify every amino acid because there are only 16 possible
two-nucleotide combinations (4²). In contrast, there are 64 possible
nucleotide triplets (4³), which is far more than the number of amino acids.
Scientists theorized that amino acids were encoded by nucleotide triplets 
and that the genetic code was degenerate. In other words, a given amino 
acid could be encoded by more than one nucleotide triplet. This was later 
confirmed experimentally; Francis Crick and Sydney Brenner used the 
chemical mutagen proflavin to insert one, two, or three nucleotides into the 
gene of a virus. When one or two nucleotides were inserted, protein 
synthesis was completely abolished. When three nucleotides were inserted, 
the protein was synthesized and functional. This demonstrated that three 
nucleotides specify each amino acid. These nucleotide triplets are called 
codons. The insertion of one or two nucleotides completely changed the 
triplet reading frame, thereby altering the message for every subsequent 
amino acid ([link]). Though insertion of three nucleotides caused an extra 
amino acid to be inserted during translation, the integrity of the rest of the 
protein was maintained. 
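
Both the counting argument and the reading-frame argument can be reproduced with a short sketch. The sequence, the insertion site, and the inserted bases are hypothetical and chosen only for illustration; the code manipulates text and does not model the original experiment.

```python
def possible_words(length, alphabet_size=4):
    """Number of distinct nucleotide 'words' of a given length."""
    return alphabet_size ** length


def split_into_codons(sequence):
    """Group a sequence into consecutive, non-overlapping triplets."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def insert_bases(sequence, position, bases):
    """Insert bases into a sequence at a 0-based position."""
    return sequence[:position] + bases + sequence[position:]


print(possible_words(2), possible_words(3))    # 16 64

original = "AGCGUACCCUAC"                      # hypothetical message
print(0, split_into_codons(original))
for bases in ("A", "AA", "AAA"):
    mutant = insert_bases(original, 3, bases)
    print(len(bases), split_into_codons(mutant))
```

With one or two inserted bases, every triplet after the insertion site changes. With three, a single extra triplet appears and the downstream triplets are the same as in the original.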


Scientists painstakingly solved the genetic code by translating synthetic 
mRNAs in vitro and sequencing the proteins they specified ([link]).


This figure shows the genetic
code for translating each 
nucleotide triplet in mRNA into 
an amino acid or a termination 
signal in a nascent protein. 
(credit: modification of work by 
NIH) 


In addition to instructing the addition of a specific amino acid to a 
polypeptide chain, three of the 64 codons terminate protein synthesis and 
release the polypeptide from the translation machinery. These triplets are 
called nonsense codons, or stop codons. Another codon, AUG, also has a 
special function. In addition to specifying the amino acid methionine, it also 
serves as the start codon to initiate translation. The reading frame for 
translation is set by the AUG start codon near the 5' end of the mRNA. 
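
The roles of the start and stop codons can be illustrated with a small translation sketch. It makes simplifying assumptions that are not in the text: translation begins at the first AUG found in the string, the example mRNA is hypothetical, and amino acids are written as one-letter codes with "*" marking a stop codon. The table is the standard genetic code, laid out with bases in the order U, C, A, G.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start codon, no protein
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # stop codon ends synthesis
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("GGAUGAGCGUACCCUACUAAGG"))   # MSVPY
```

The two bases before AUG are skipped, the frame is read in triplets from AUG onward, and the UAA codon ends the chain.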


The genetic code is universal. With a few exceptions, virtually all species 
use the same genetic code for protein synthesis. Conservation of codons 
means that a purified mRNA encoding the globin protein in horses could be 
transferred to a tulip cell, and the tulip would synthesize horse globin. That 
there is only one genetic code is powerful evidence that all of life on Earth 
shares a common origin, especially considering that there are about 10⁸⁴
possible combinations of 20 amino acids and 64 triplet codons. 


Note: 
Link to Learning
Transcribe a gene and translate it to protein using complementary pairing
and the genetic code at this site. 


Frameshift Mutations 


AGCGUACCCUAC => AGCGCCCUACUU


Ser Val Pro Tyr => Ser Ala Leu Leu


The deletion of two nucleotides shifts 
the reading frame of an mRNA and
changes the entire protein message, 
creating a nonfunctional protein or 

terminating protein synthesis 
altogether. 


Degeneracy is believed to be a cellular mechanism to reduce the negative 
impact of random mutations. Codons that specify the same amino acid 
typically only differ by one nucleotide. In addition, amino acids with 
chemically similar side chains are encoded by similar codons. This nuance 
of the genetic code ensures that a single-nucleotide substitution mutation
might either specify the same amino acid, and so have no effect, or specify a


similar amino acid, preventing the protein from being rendered completely 
nonfunctional. 
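
One way to explore this buffering is to tally, for each codon position, how many single-nucleotide substitutions leave the encoded meaning unchanged. The sketch below uses the standard genetic code with one-letter amino acid codes; treating a change from one stop codon to another as "unchanged" is a choice made here for simplicity, and the function name is illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_changes_by_position():
    """Count substitutions that keep the same meaning, per codon position."""
    counts = [0, 0, 0]
    for codon, meaning in CODON_TABLE.items():
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                mutant = codon[:position] + base + codon[position + 1:]
                if CODON_TABLE[mutant] == meaning:
                    counts[position] += 1
    return counts


counts = synonymous_changes_by_position()
for position, count in enumerate(counts, start=1):
    print(f"position {position}: {count} of 192 substitutions are synonymous")
```

Changes at the third position are by far the most likely to be synonymous, which matches the observation that codons for the same amino acid typically differ by one nucleotide.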


Note: 
Scientific Method Connection 
Which Has More DNA: A Kiwi or a Strawberry? 


Do you think that a kiwi or a strawberry has more 
DNA per fruit? (credit “kiwi”: "Kelbv"/Flickr;
credit “strawberry”: Alisdair McDiarmid)


Question: Would a kiwifruit and strawberry that are approximately the 
same size ([link]) also have approximately the same amount of DNA?
Background: Genes are carried on chromosomes and are made of DNA. 
All mammals are diploid, meaning they have two copies of each 
chromosome. However, not all plants are diploid. The common strawberry 
is octoploid (8n) and the cultivated kiwi is hexaploid (6n). Research the 
total number of chromosomes in the cells of each of these fruits and think 
about how this might correspond to the amount of DNA in these fruits’ cell 
nuclei. Read about the technique of DNA isolation to understand how each 
step in the isolation protocol helps liberate and precipitate DNA. 
Hypothesis: Hypothesize whether you would be able to detect a difference 
in DNA quantity from similarly sized strawberries and kiwis. Which fruit 
do you think would yield more DNA? 

Test your hypothesis: Isolate the DNA from a strawberry and a kiwi that 
are similarly sized. Perform the experiment in at least triplicate for each 
fruit. 


1. Prepare a bottle of DNA extraction buffer from 900 mL water, 50 mL 
dish detergent, and two teaspoons of table salt. Mix by inversion (cap 
it and turn it upside down a few times). 

2. Grind a strawberry and a kiwifruit by hand in a plastic bag, or using a 
mortar and pestle, or with a metal bowl and the end of a blunt 
instrument. Grind for at least two minutes per fruit. 


3. Add 10 mL of the DNA extraction buffer to each fruit, and mix well 
for at least one minute. 

4. Remove cellular debris by filtering each fruit mixture through 
cheesecloth or porous cloth and into a funnel placed in a test tube or 
an appropriate container. 

5. Pour ice-cold ethanol or isopropanol (rubbing alcohol) into the test 
tube. You should observe white, precipitated DNA. 

6. Gather the DNA from each fruit by winding it around separate glass 
rods. 


Record your observations: Because you are not quantitatively measuring 
DNA volume, you can record for each trial whether the two fruits 
produced the same or different amounts of DNA as observed by eye. If one 
or the other fruit produced noticeably more DNA, record this as well. 
Determine whether your observations are consistent with several pieces of 
each fruit. 

Analyze your data: Did you notice an obvious difference in the amount of 
DNA produced by each fruit? Were your results reproducible? 

Draw a conclusion: Given what you know about the number of 
chromosomes in each fruit, can you conclude that chromosome number 
necessarily correlates to DNA amount? Can you identify any drawbacks to 
this procedure? If you had access to a laboratory, how could you 
standardize your comparison and make it more quantitative? 


Section Summary 


The genetic code refers to the DNA alphabet (A, T, C, G), the RNA 
alphabet (A, U, C, G), and the polypeptide alphabet (20 amino acids). The 
Central Dogma describes the flow of genetic information in the cell from 
genes to mRNA to proteins. Genes are used to make mRNA by the process 
of transcription; mRNA is used to synthesize proteins by the process of 
translation. The genetic code is degenerate because 64 triplet codons in 
mRNA specify only 20 amino acids and three nonsense codons. Almost 
every species on the planet uses the same genetic code. 
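
The degeneracy described in this summary can be checked by enumerating the codons directly. The following is a minimal sketch, assuming the standard genetic code written as a 64-letter string in U, C, A, G order (one-letter amino acid symbols, with * marking a nonsense codon); the names CODON_TABLE and synonymous_codons are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a nonsense (stop) codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair each of the 4 x 4 x 4 = 64 triplets with the amino acid it specifies.
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that specifies the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


stop_codons = synonymous_codons("*")
sense_amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE), "codons specify", len(sense_amino_acids), "amino acids")
print("nonsense codons:", stop_codons)
print("isoleucine codons:", synonymous_codons("I"))
```

Because 61 sense codons are shared among only 20 amino acids, most amino acids are specified by more than one codon, as the isoleucine example shows.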


Review Questions 


Exercise: 


Problem: 


The AUC and AUA codons in mRNA both specify isoleucine. What 
feature of the genetic code explains this? 


a. complementarity 
b. nonsense codons 
c. universality 
d. degeneracy 


Solution: 


D 


Exercise: 


Problem: How many nucleotides are in 12 mRNA codons? 


Solution: 


C 


Free Response 


Exercise: 


Problem: 


Imagine if there were 200 commonly occurring amino acids instead of 
20. Given what you know about the genetic code, what would be the 
shortest possible codon length? Explain. 


Solution: 


For 200 commonly occurring amino acids, codons consisting of four 
types of nucleotides would have to be at least four nucleotides long, 
because 4^4 = 256, whereas three-nucleotide codons provide only 
4^3 = 64 combinations. There would be much less degeneracy in this case. 
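
The arithmetic behind this answer can be written as a short calculation. This is a sketch only; the function name min_codon_length is hypothetical, and it follows the simplification used in the solution of counting amino acids without reserving codons for stop signals.

```python
def min_codon_length(n_amino_acids, n_bases=4):
    """Smallest codon length whose combinations cover every amino acid.

    A codon of length k built from n_bases nucleotides has n_bases ** k
    possible sequences, so we look for the first k that is large enough.
    """
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length


print(min_codon_length(20))   # the actual genetic code
print(min_codon_length(200))  # the hypothetical case in the exercise
```

With 20 amino acids, two-nucleotide codons (16 combinations) fall short and three suffice; with 200 amino acids, three-nucleotide codons (64 combinations) fall short and four are required.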


Exercise: 


Problem: 


Discuss how degeneracy of the genetic code makes cells more robust 
to mutations. 


Solution: 


Codons that specify the same amino acid typically only differ by one 
nucleotide. In addition, amino acids with chemically similar side 
chains are encoded by similar codons. This nuance of the genetic code 
ensures that a single-nucleotide substitution mutation might either 
specify the same amino acid and have no effect, or may specify a 
similar amino acid, preventing the protein from being rendered 
completely nonfunctional. 


Glossary 


Central Dogma 
States that genes specify the sequence of mRNAs, which in turn 
specify the sequence of proteins 


codon 
three consecutive nucleotides in mRNA that specify the insertion of an 
amino acid or the release of a polypeptide chain during translation 


colinear 
in terms of RNA and protein, three “units” of RNA (nucleotides) 
specify one “unit” of protein (amino acid) in a consecutive fashion 


degeneracy 
(of the genetic code) describes that a given amino acid can be encoded 
by more than one nucleotide triplet; the code is degenerate, but not 
ambiguous 


nonsense codon 
one of the three mRNA codons that specifies termination of translation 


reading frame 
sequence of triplet codons in mRNA that specify a particular protein; a 
ribosome shift of one or two nucleotides in either direction completely 
abolishes synthesis of that protein 


Prokaryotic Transcription 
By the end of this section, you will be able to: 


• List the different steps in prokaryotic transcription 
• Discuss the role of promoters in prokaryotic transcription 
• Describe how and when transcription is terminated 


The prokaryotes, which include bacteria and archaea, are mostly single- 
celled organisms that, by definition, lack membrane-bound nuclei and other 
organelles. A bacterial chromosome is a covalently closed circle that, unlike 
eukaryotic chromosomes, is not organized around histone proteins. The 
central region of the cell in which prokaryotic DNA resides is called the 
nucleoid. In addition, prokaryotes often have abundant plasmids, which are 
shorter circular DNA molecules that may only contain one or a few genes. 
Plasmids can be transferred independently of the bacterial chromosome 
during cell division and often carry traits such as antibiotic resistance. 


Transcription in prokaryotes (and in eukaryotes) requires the DNA double 
helix to partially unwind in the region of mRNA synthesis. The region of 
unwinding is called a transcription bubble. Transcription always proceeds 
from the same DNA strand for each gene, which is called the template 
strand. The mRNA product is complementary to the template strand and is 
almost identical to the other DNA strand, called the nontemplate strand. 
The only difference is that in mRNA, all of the T nucleotides are replaced 
with U nucleotides. In an RNA double helix, A can bind U via two 
hydrogen bonds, just as in A-T pairing in a DNA double helix. 
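
The relationship among the template strand, the nontemplate strand, and the mRNA can be illustrated with a minimal sketch. The nine-nucleotide template below is a made-up example, and the function names are illustrative rather than part of any library.

```python
# Base-pairing rules: a DNA template base pairs with an RNA base (A pairs with U),
# or with a DNA base on the nontemplate strand (A pairs with T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}
DNA_TO_DNA = {"A": "T", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3_to_5):
    """Return the mRNA (written 5' to 3') made from a template read 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def nontemplate_strand(template_3_to_5):
    """Return the nontemplate DNA strand (written 5' to 3')."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5)


template = "TACGGCATT"  # hypothetical template strand, written 3' to 5'
mrna = transcribe(template)
coding = nontemplate_strand(template)

print("template    3'", template, "5'")
print("nontemplate 5'", coding, "3'")
print("mRNA        5'", mrna, "3'")
```

Replacing every T in the nontemplate strand with U reproduces the mRNA, which is the "almost identical" relationship described above.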


The nucleotide pair in the DNA double helix that corresponds to the site 
from which the first 5' mRNA nucleotide is transcribed is called the +1 site, 
or the initiation site. Nucleotides preceding the initiation site are given 
negative numbers and are designated upstream. Conversely, nucleotides 
following the initiation site are denoted with “+” numbering and are called 
downstream nucleotides. 
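
The numbering scheme can be sketched as a conversion from a position in a sequence to a promoter coordinate. This sketch assumes the usual convention that there is no position 0, so the nucleotide immediately upstream of +1 is -1; the function names and index values are hypothetical.

```python
def promoter_coordinate(index, start_index):
    """Convert a 0-based sequence index into a coordinate relative to +1.

    start_index is the 0-based index of the initiation site. Upstream
    positions are negative, and position 0 is skipped by convention.
    """
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset


def label(coordinate):
    """Format a coordinate with an explicit sign, as in '+1' or '-35'."""
    return f"{coordinate:+d}"


start = 50  # hypothetical index of the initiation site in a sequence
for index in (15, 40, 49, 50, 51):
    print(index, "->", label(promoter_coordinate(index, start)))
```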


Initiation of Transcription in Prokaryotes 


Prokaryotes do not have membrane-enclosed nuclei. Therefore, the 
processes of transcription, translation, and mRNA degradation can all occur 
simultaneously. The intracellular level of a bacterial protein can quickly be 
amplified by multiple transcription and translation events occurring 
concurrently on the same DNA template. Prokaryotic transcription often 
covers more than one gene and produces polycistronic mRNAs that specify 
more than one protein. 


Our discussion here will exemplify transcription by describing this process 
in Escherichia coli, a well-studied bacterial species. Although some 
differences exist between transcription in E. coli and transcription in 
archaea, an understanding of E. coli transcription can be applied to virtually 
all bacterial species. 


Prokaryotic RNA Polymerase 


Prokaryotes use the same RNA polymerase to transcribe all of their genes. 
In E. coli, the polymerase is composed of five polypeptide subunits, two of 
which are identical. Four of these subunits, denoted α, α, β, and β′, comprise 
the polymerase core enzyme. These subunits assemble every time a gene is 
transcribed, and they disassemble once transcription is complete. Each 
subunit has a unique role: the two α-subunits are necessary to assemble the 
polymerase on the DNA; the β-subunit binds to the ribonucleoside 
triphosphate that will become part of the nascent ("recently born") mRNA 
molecule; and the β′-subunit binds the DNA template strand. The fifth 
subunit, σ, is involved only in transcription initiation. It confers 
transcriptional specificity such that the polymerase begins to synthesize 
mRNA from an appropriate initiation site. Without σ, the core enzyme would 
transcribe from random sites and would produce mRNA molecules that 
specified protein gibberish. The polymerase composed of all five subunits is 
called the holoenzyme. 


Prokaryotic Promoters 


A promoter is a DNA sequence onto which the transcription machinery 
binds and initiates transcription. In most cases, promoters exist upstream of 
the genes they regulate. The specific sequence of a promoter is very 
important because it determines whether the corresponding gene is 
transcribed all the time, some of the time, or infrequently. Although 
promoters vary among prokaryotic genomes, a few elements are conserved. 
At the -10 and -35 regions upstream of the initiation site, there are two 
promoter consensus sequences, or regions that are similar across all 
promoters and across various bacterial species ([link]). The -10 consensus 
sequence, called the -10 region, is TATAAT. The -35 sequence, TTGACA, 
is recognized and bound by σ. Once this interaction is made, the subunits of 
the core enzyme bind to the site. The A-T-rich -10 region facilitates 
unwinding of the DNA template, and several phosphodiester bonds are 
made. The transcription initiation phase ends with the production of 
abortive transcripts, which are polymers of approximately 10 nucleotides 
that are made and released. 
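
Because consensus sequences are similar rather than identical across promoters, a search for them usually tolerates mismatches. The following is a minimal sketch of such a search; the upstream sequence and its 17-nucleotide spacer are invented for illustration, and best_match is a hypothetical helper rather than a function from any bioinformatics library.

```python
MINUS_35_CONSENSUS = "TTGACA"
MINUS_10_CONSENSUS = "TATAAT"


def mismatches(a, b):
    """Count the positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match(sequence, consensus):
    """Return (start_index, mismatch_count) for the window closest to the consensus."""
    k = len(consensus)
    starts = range(len(sequence) - k + 1)
    best = min(starts, key=lambda i: mismatches(sequence[i:i + k], consensus))
    return best, mismatches(sequence[best:best + k], consensus)


# Hypothetical nontemplate-strand sequence upstream of a gene (5' to 3').
# The 17-nucleotide spacer between the two elements is an assumption.
upstream = "GCGC" + "TTGACA" + "GC" * 8 + "G" + "TATAAT" + "GCGCGCG"

print("-35 element:", best_match(upstream, MINUS_35_CONSENSUS))
print("-10 element:", best_match(upstream, MINUS_10_CONSENSUS))
```

A real promoter would rarely match both elements perfectly; the mismatch count gives a rough sense of how closely a candidate site resembles the consensus.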


(Figure labels: +1 transcription start site; RNA polymerase) 

The σ subunit of prokaryotic RNA polymerase recognizes consensus 
sequences found in the promoter region upstream of the transcription 
start site. The σ subunit dissociates from the polymerase after 
transcription has been initiated. 


Note: 
Link to Learning 
View this MolecularMovies animation to see the first part of transcription 
and the base sequence repetition of the TATA box. 


Elongation and Termination in Prokaryotes 


The transcription elongation phase begins with the release of the σ subunit 
from the polymerase. The dissociation of σ allows the core enzyme to 
proceed along the DNA template, synthesizing mRNA in the 5' to 3' 
direction at a rate of approximately 40 nucleotides per second. As 
elongation proceeds, the DNA is continuously unwound ahead of the core 
enzyme and rewound behind it ([link]). The base pairing between DNA and 
RNA is not stable enough on its own to hold the mRNA synthesis 
components together. Instead, the RNA polymerase acts as a stable linker 
between the DNA template and the nascent RNA strand to ensure that 
elongation is not interrupted prematurely. 
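
The elongation rate gives a rough estimate of how long a gene takes to transcribe. The calculation below is a sketch that assumes a constant rate with no pausing; the 1,200-nucleotide gene length is a made-up example.

```python
ELONGATION_RATE_NT_PER_S = 40  # approximate rate given in the text


def transcription_time_seconds(gene_length_nt, rate=ELONGATION_RATE_NT_PER_S):
    """Estimate elongation time, assuming a constant rate and no pausing."""
    return gene_length_nt / rate


gene_length = 1200  # hypothetical gene length in nucleotides
print(transcription_time_seconds(gene_length), "seconds")
```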


(Figure labels: transcription; DNA; RNA polymerase) 

During elongation, the prokaryotic RNA polymerase tracks along the DNA 
template, synthesizes mRNA in the 5' to 3' direction, and unwinds and 
rewinds the DNA as it is read. 


Prokaryotic Termination Signals 


Once a gene is transcribed, the prokaryotic polymerase needs to be 
instructed to dissociate from the DNA template and liberate the newly made 
mRNA. Depending on the gene being transcribed, there are two kinds of 
termination signals. One is protein-based and the other is RNA-based. Rho- 
dependent termination is controlled by the rho protein, which tracks along 
behind the polymerase on the growing mRNA chain. Near the end of the 
gene, the polymerase encounters a run of G nucleotides on the DNA 
template and it stalls. As a result, the rho protein collides with the 
polymerase. The interaction with rho releases the mRNA from the 
transcription bubble. 


Rho-independent termination is controlled by specific sequences in the 
DNA template strand. As the polymerase nears the end of the gene being 
transcribed, it encounters a region rich in C-G nucleotides. The mRNA 
folds back on itself, and the complementary C-G nucleotides bind together. 
The result is a stable hairpin that causes the polymerase to stall as soon as 
it begins to transcribe a region rich in A-T nucleotides. The complementary 
U-A region of the mRNA transcript forms only a weak interaction with the 
template DNA. This, coupled with the stalled polymerase, induces enough 
instability for the core enzyme to break away and liberate the new mRNA 
transcript. 
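
The signature of rho-independent termination, a C-G-rich hairpin followed by a weakly paired U-rich stretch, can be searched for in an mRNA sequence. The sketch below is a deliberately simplified illustration: the stem length, loop sizes, length of the U run, and the example sequence are all assumptions chosen for clarity, not measured values.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that would base-pair with rna when it folds back."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_terminator(mrna, stem_len=6, loop_range=(3, 8), u_run=4, min_gc=0.5):
    """Return the start index of a G-C-rich hairpin followed by a run of U, or None."""
    for start in range(len(mrna) - 2 * stem_len):
        left = mrna[start:start + stem_len]
        gc_fraction = sum(base in "GC" for base in left) / stem_len
        if gc_fraction < min_gc:
            continue  # the stem must be rich in C-G pairs to be stable
        for loop in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem_len + loop
            right = mrna[right_start:right_start + stem_len]
            tail = mrna[right_start + stem_len:right_start + stem_len + u_run]
            if right == reverse_complement(left) and tail == "U" * u_run:
                return start
    return None


# Hypothetical transcript end: stem GCCGCC, loop UUCG, stem GGCGGC, then a run of U.
transcript_end = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU"
print(find_terminator(transcript_end))
```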


Upon termination, the process of transcription is complete. By the time 
termination occurs, the prokaryotic transcript would already have been used 
to begin synthesis of numerous copies of the encoded protein because these 
processes can occur concurrently. The unification of transcription, 
translation, and even mRNA degradation is possible because all of these 
processes occur in the same 5' to 3' direction, and because there is no 
membranous compartmentalization in the prokaryotic cell ([link]). In 
contrast, the presence of a nucleus in eukaryotic cells precludes 
simultaneous transcription and translation. 


(Figure label: polyribosome) 

Multiple polymerases can transcribe a single bacterial gene while numerous 
ribosomes concurrently translate the mRNA transcripts into polypeptides. 
In this way, a specific protein can rapidly reach a high concentration in 
the bacterial cell. 


Note: 
Link to Learning 




Visit this BioStudio animation to see the process of prokaryotic 
transcription. 


Section Summary 


In prokaryotes, mRNA synthesis is initiated at a promoter sequence on the 
DNA template comprising two consensus sequences that recruit RNA 
polymerase. The prokaryotic polymerase consists of a core enzyme of four 
protein subunits and a σ protein that assists only with initiation. Elongation 
synthesizes mRNA in the 5' to 3' direction at a rate of 40 nucleotides per 
second. Termination liberates the mRNA and occurs either by rho protein 
interaction or by the formation of an mRNA hairpin. 


Review Questions 


Exercise: 


Problem: 


Which subunit of the E. coli polymerase confers specificity to 
transcription? 


a. α 
b. β 
c. β′ 
d. σ 


Solution: 


D 
Exercise: 
Problem: 


The -10 and -35 regions of prokaryotic promoters are called consensus 
sequences because 


a. they are identical in all bacterial species 
b. they are similar in all bacterial species 
c. they exist in all organisms 


d. they have the same function in all organisms 


Solution: 


B 


Free Response 


Exercise: 


Problem: 


If mRNA is complementary to the DNA template strand and the DNA 
template strand is complementary to the DNA nontemplate strand, 
then why are base sequences of mRNA and the DNA nontemplate 
strand not identical? Could they ever be? 


Solution: 


DNA is different from RNA in that T nucleotides in DNA are replaced 
with U nucleotides in RNA. Therefore, they could never be identical in 
base sequence. 


Exercise: 


Problem: 


In your own words, describe the difference between rho-dependent and 
rho-independent termination of transcription in prokaryotes. 


Solution: 


Rho-dependent termination is controlled by the rho protein, which 
tracks along behind the polymerase on the growing mRNA chain. Near 
the end of the gene, the polymerase stalls at a run of G nucleotides on 
the DNA template. The rho protein collides with the polymerase and 
releases mRNA from the transcription bubble. Rho-independent 
termination is controlled by specific sequences in the DNA template 
strand. As the polymerase nears the end of the gene being transcribed, 
it encounters a region rich in C-G nucleotides. This creates an mRNA 
hairpin that causes the polymerase to stall right as it begins to 
transcribe a region rich in A-T nucleotides. Because A-U bonds are 
less thermostable, the core enzyme falls away. 


Glossary 


consensus 
DNA sequence that is used by many species to perform the same or 
similar functions 


core enzyme 
prokaryotic RNA polymerase consisting of α, α, β, and β′ but missing 
σ; this complex performs elongation 


downstream 
nucleotides following the initiation site in the direction of mRNA 
transcription; in general, sequences that are toward the 3' end relative 
to a site on the mRNA 


hairpin 
structure of RNA when it folds back on itself and forms intramolecular 
hydrogen bonds between complementary nucleotides 


holoenzyme 
prokaryotic RNA polymerase consisting of α, α, β, β′, and σ; this 
complex is responsible for transcription initiation 


initiation site 
nucleotide from which mRNA synthesis proceeds in the 5' to 3' 
direction; denoted with a “+1” 


nontemplate strand 
strand of DNA that is not used to transcribe mRNA; this strand is 
identical to the mRNA except that T nucleotides in the DNA are 
replaced by U nucleotides in the mRNA 


plasmid 
extrachromosomal, covalently closed, circular DNA molecule that may 
only contain one or a few genes; common in prokaryotes 


promoter 
DNA sequence to which RNA polymerase and associated factors bind 
and initiate transcription 


Rho-dependent termination 
in prokaryotes, termination of transcription by an interaction between 
RNA polymerase and the rho protein at a run of G nucleotides on the 
DNA template 


Rho-independent termination 
sequence-dependent termination of prokaryotic mRNA synthesis; caused 
by hairpin formation in the mRNA that stalls the polymerase 


TATA box 
conserved promoter sequence in eukaryotes and prokaryotes that helps 
to establish the initiation site for transcription 


template strand 
strand of DNA that specifies the complementary mRNA molecule 


transcription bubble 
region of locally unwound DNA that allows for transcription of 
mRNA 


upstream 
nucleotides preceding the initiation site; in general, sequences toward 
the 5' end relative to a site on the mRNA 


Eukaryotic Transcription 
By the end of this section, you will be able to: 


• List the steps in eukaryotic transcription 
• Discuss the role of RNA polymerases in transcription 
• Compare and contrast the three RNA polymerases 
• Explain the significance of transcription factors 


Prokaryotes and eukaryotes perform fundamentally the same process of 
transcription, with a few key differences. The most important difference 
between prokaryotes and eukaryotes is the latter’s membrane-bound 
nucleus and organelles. With the genes bound in a nucleus, the eukaryotic 
cell must be able to transport its mRNA to the cytoplasm and must protect 
its mRNA from degrading before it is translated. Eukaryotes also employ 
three different polymerases that each transcribe a different subset of genes. 
Eukaryotic mRNAs are usually monogenic, meaning that they specify a 
single protein. 


Initiation of Transcription in Eukaryotes 


Unlike the prokaryotic polymerase that can bind to a DNA template on its 
own, eukaryotes require several other proteins, called transcription factors, 
to first bind to the promoter region and then help recruit the appropriate 
polymerase. 


The Three Eukaryotic RNA Polymerases 


The features of eukaryotic mRNA synthesis are markedly more complex 
than those of prokaryotes. Instead of a single polymerase comprising five 
subunits, the eukaryotes have three polymerases that are each made up of 
10 subunits or more. Each eukaryotic polymerase also requires a distinct set 
of transcription factors to bring it to the DNA template. 


RNA polymerase I is located in the nucleolus, a specialized nuclear 
substructure in which ribosomal RNA (rRNA) is transcribed, processed, 
and assembled into ribosomes ([link]). The rRNA molecules are considered 
structural RNAs because they have a cellular role but are not translated into 
protein. The rRNAs are components of the ribosome and are essential to the 
process of translation. RNA polymerase I synthesizes all of the rRNAs 
except for the 5S rRNA molecule. The “S” designation applies to 
“Svedberg” units, a nonadditive value that characterizes the speed at which 
a particle sediments during centrifugation. 


Locations, Products, and Sensitivities of the Three Eukaryotic 
RNA Polymerases 

RNA          Cellular       Product of                    α-Amanitin 
Polymerase   Compartment    Transcription                 Sensitivity 

I            Nucleolus      All rRNAs except 5S rRNA      Insensitive 
II           Nucleus        All protein-coding nuclear    Extremely sensitive 
                            pre-mRNAs 
III          Nucleus        5S rRNA, tRNAs, and small     Moderately sensitive 
                            nuclear RNAs 


RNA polymerase II is located in the nucleus and synthesizes all protein- 
coding nuclear pre-mRNAs. Eukaryotic pre-mRNAs undergo extensive 
processing after transcription but before translation. For clarity, this 
module’s discussion of transcription and translation in eukaryotes will use 
the term “mRNAs” to describe only the mature, processed molecules that 
are ready to be translated. RNA polymerase II is responsible for 
transcribing the overwhelming majority of eukaryotic genes. 


RNA polymerase III is also located in the nucleus. This polymerase 
transcribes a variety of structural RNAs that includes the 5S pre-rRNA, 
transfer pre-RNAs (pre-tRNAs), and small nuclear pre-RNAs. The tRNAs 
have a critical role in translation; they serve as the adaptor molecules 
between the mRNA template and the growing polypeptide chain. Small 
nuclear RNAs have a variety of functions, including “splicing” pre-mRNAs 
and regulating transcription factors. 


A scientist characterizing a new gene can determine which polymerase 
transcribes it by testing whether the gene is expressed in the presence of a 
particular mushroom poison, α-amanitin ([link]). Interestingly, α-amanitin 
produced by Amanita phalloides, the Death Cap mushroom, affects the 
three polymerases very differently. RNA polymerase I is completely 
insensitive to α-amanitin, meaning that the polymerase can transcribe DNA 
in vitro in the presence of this poison. In contrast, RNA polymerase II is 
extremely sensitive to α-amanitin, and RNA polymerase III is moderately 
sensitive. Knowing the transcribing polymerase can clue a researcher into 
the general function of the gene being studied. Because RNA polymerase II 
transcribes the vast majority of genes, we will focus on this polymerase in 
our subsequent discussions about eukaryotic transcription factors and 
promoters. 
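
The reasoning a scientist might follow in such an α-amanitin experiment can be sketched as a simple decision rule. The "low" and "high" doses below are hypothetical experimental conditions, not values given in the text, and the function name is illustrative.

```python
# Sensitivities as summarized in the text.
SENSITIVITY = {
    "RNA polymerase I": "insensitive",
    "RNA polymerase II": "extremely sensitive",
    "RNA polymerase III": "moderately sensitive",
}


def infer_polymerase(expressed_at_low_dose, expressed_at_high_dose):
    """Guess which polymerase transcribes a gene from its response to the poison.

    Assumes a low dose blocks only the extremely sensitive polymerase and
    a high dose blocks every polymerase except the insensitive one.
    """
    if expressed_at_high_dose:
        return "RNA polymerase I"    # unaffected even by a high dose
    if expressed_at_low_dose:
        return "RNA polymerase III"  # blocked only by the high dose
    return "RNA polymerase II"       # blocked even by the low dose


result = infer_polymerase(expressed_at_low_dose=False, expressed_at_high_dose=False)
print(result, "is", SENSITIVITY[result])
```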


Structure of an RNA Polymerase II Promoter 


Eukaryotic promoters are much larger and more complex than prokaryotic 
promoters, but both have a TATA box. For example, in the mouse thymidine 
kinase gene, the TATA box is located at approximately -30 relative to the 
initiation (+1) site ([link]). For this gene, the exact TATA box sequence is 
TATAAAA, as read in the 5' to 3' direction on the nontemplate strand. This 
sequence is not identical to the E. coli TATA box, but it conserves the A-T 
rich element. The thermostability of A-T bonds is low and this helps the 
DNA template to locally unwind in preparation for transcription. 


(Figure labels: promoter; transcription start site; RNA polymerase II) 

A generalized promoter of a gene transcribed by RNA polymerase II is 
shown. Transcription factors recognize the promoter. RNA polymerase II 
then binds and forms the transcription initiation complex. 


Note: 
Art Connection 


(Figure labels: primary RNA transcript; RNA processing; spliced RNA; 
5' cap; poly-A tail; 5' untranslated region; 3' untranslated region) 

Eukaryotic mRNA contains introns that must be spliced out. A 5' cap and 
3' poly-A tail are also added. 


A scientist splices a eukaryotic promoter in front of a bacterial gene and 
inserts the gene in a bacterial chromosome. Would you expect the bacteria 
to transcribe the gene? 


The mouse genome includes one gene and two pseudogenes for 
cytoplasmic thymidine kinase. Pseudogenes are genes that have lost their 
protein-coding ability or are no longer expressed by the cell. These 
pseudogenes are copied from mRNA and incorporated into the 
chromosome. In addition to the TATA box, the mouse thymidine kinase 
promoter has a conserved CAAT box (GGCCAATCT) at approximately -80. This 
sequence is essential and is involved in binding transcription factors. 
Further upstream of the TATA box, eukaryotic promoters may also contain 
one or more GC-rich boxes (GGCG) or octamer boxes (ATTTGCAT). 
These elements bind cellular factors that increase the efficiency of 
transcription initiation and are often identified in more “active” genes that 
are constantly being expressed by the cell. 
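
The promoter elements named above can be located in a sequence and reported relative to the +1 site. The following is a minimal sketch; the test sequence is artificial (the letter N stands in for unspecified nucleotides), and find_elements is a hypothetical helper that looks only for exact matches on the nontemplate strand.

```python
# Element sequences as given in the text, read 5' to 3' on the nontemplate strand.
PROMOTER_ELEMENTS = {
    "TATA box": "TATAAAA",
    "CAAT box": "GGCCAATCT",
    "GC-rich box": "GGCG",
    "octamer box": "ATTTGCAT",
}


def find_elements(nontemplate, start_index):
    """Return (element, coordinate) pairs, with coordinates relative to +1.

    start_index is the 0-based index of the initiation site; there is no
    position 0, so the nucleotide just upstream of +1 is -1.
    """
    hits = []
    for name, motif in PROMOTER_ELEMENTS.items():
        position = nontemplate.find(motif)
        while position != -1:
            offset = position - start_index
            coordinate = offset + 1 if offset >= 0 else offset
            hits.append((name, coordinate))
            position = nontemplate.find(motif, position + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Artificial promoter: CAAT box starting at -80, TATA box at -30, +1 at index 100.
sequence = "N" * 20 + "GGCCAATCT" + "N" * 41 + "TATAAAA" + "N" * 23 + "N" * 10

for name, coordinate in find_elements(sequence, start_index=100):
    print(f"{name}: {coordinate:+d}")
```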


Transcription Factors for RNA Polymerase II 


The complexity of eukaryotic transcription does not end with the 
polymerases and promoters. An army of basal transcription factors, 
enhancers, and silencers also help to regulate the frequency with which pre- 
mRNA is synthesized from a gene. Enhancers and silencers affect the 
efficiency of transcription but are not necessary for transcription to proceed. 
Basal transcription factors are crucial in the formation of a preinitiation 
complex on the DNA template that subsequently recruits RNA polymerase 
II for transcription initiation. 


The names of the basal transcription factors begin with “TFII” (this is the 
transcription factor for RNA polymerase II) and are specified with the 
letters A through J. The transcription factors systematically fall into place on the 
DNA template, with each one further stabilizing the preinitiation complex 
and contributing to the recruitment of RNA polymerase II. 


The processes of bringing RNA polymerases I and III to the DNA template 
involve slightly less complex collections of transcription factors, but the 
general theme is the same. Eukaryotic transcription is a tightly regulated 
process that requires a variety of proteins to interact with each other and 
with the DNA strand. Although the process of transcription in eukaryotes 
involves a greater metabolic investment than in prokaryotes, it ensures that 
the cell transcribes precisely the pre-mRNAs that it needs for protein 
synthesis. 


Note: 

Evolution Connection 

The Evolution of Promoters 

The evolution of genes may be a familiar concept. Mutations can occur in 
genes during DNA replication, and the result may or may not be beneficial 
to the cell. By altering an enzyme, structural protein, or some other factor, 
the process of mutation can transform functions or physical features. 
However, eukaryotic promoters and other gene regulatory sequences may 
evolve as well. For instance, consider a gene that, over many generations, 
becomes more valuable to the cell. Maybe the gene encodes a structural 
protein that the cell needs to synthesize in abundance for a certain function. 
If this is the case, it would be beneficial to the cell for that gene’s promoter 
to recruit transcription factors more efficiently and increase gene 
expression. 

Scientists examining the evolution of promoter sequences have reported 
varying results. In part, this is because it is difficult to infer exactly where a 
eukaryotic promoter begins and ends. Some promoters occur within genes; 
others are located very far upstream, or even downstream, of the genes 
they are regulating. However, when researchers limited their examination 
to human core promoter sequences that were defined experimentally as 
sequences that bind the preinitiation complex, they found that promoters 
evolve even faster than protein-coding genes. 

It is still unclear how promoter evolution might correspond to the evolution 
of humans or other higher organisms. However, the evolution of a 
promoter to effectively make more or less of a given gene product is an 
intriguing alternative to the evolution of the genes themselves.[footnote] 

H Liang et al., “Fast evolution of core promoters in primate genomes,” 
Molecular Biology and Evolution 25 (2008): 1239-44. 


Promoter Structures for RNA Polymerases I and III 


In eukaryotes, the conserved promoter elements differ for genes transcribed 
by RNA polymerases I, II, and III. RNA polymerase I transcribes genes that 
have two GC-rich promoter sequences in the -45 to +20 region. These 
sequences alone are sufficient for transcription initiation to occur, but 
promoters with additional sequences in the region from -180 to -105 
upstream of the initiation site will further enhance initiation. Genes that are 
transcribed by RNA polymerase III have upstream promoters or promoters 
that occur within the genes themselves. 


Eukaryotic Elongation and Termination 


Following the formation of the preinitiation complex, the polymerase is 
released from the other transcription factors, and elongation is allowed to 
proceed as it does in prokaryotes with the polymerase synthesizing 
pre-mRNA in the 5' to 3' direction. As discussed previously, RNA polymerase 
II transcribes the major share of eukaryotic genes, so this section will focus 
on how this polymerase accomplishes elongation and termination. 


Although the enzymatic process of elongation is essentially the same in 
eukaryotes and prokaryotes, the DNA template is more complex. When 
eukaryotic cells are not dividing, their genes exist as a diffuse mass of DNA 
and proteins called chromatin. The DNA is tightly packaged around 
charged histone proteins at repeated intervals. These DNA-histone 
complexes, collectively called nucleosomes, are regularly spaced and 
include 146 nucleotides of DNA wound around eight histones like thread 
around a spool. 


For polynucleotide synthesis to occur, the transcription machinery needs to 
move histones out of the way every time it encounters a nucleosome. This 
is accomplished by a special protein complex called FACT, which stands 
for “facilitates chromatin transcription.” This complex pulls histones away 
from the DNA template as the polymerase moves along it. Once the pre- 
mRNA is synthesized, the FACT complex replaces the histones to recreate 
the nucleosomes. 


The termination of transcription is different for the different polymerases. 
Unlike in prokaryotes, elongation by RNA polymerase II in eukaryotes 
takes place 1,000 to 2,000 nucleotides beyond the end of the gene being 
transcribed. This pre-mRNA tail is subsequently removed by cleavage 
during mRNA processing. On the other hand, RNA polymerases I and III 
require termination signals. Genes transcribed by RNA polymerase I 
contain a specific 18-nucleotide sequence that is recognized by a 
termination protein. The process of termination in RNA polymerase III 
involves an mRNA hairpin similar to rho-independent termination of 
transcription in prokaryotes. 


Section Summary 


Transcription in eukaryotes involves one of three types of polymerases, 
depending on the gene being transcribed. RNA polymerase II transcribes all 
of the protein-coding genes, whereas RNA polymerase I transcribes rRNA 
genes, and RNA polymerase III transcribes 5S rRNA, tRNA, and small nuclear 
RNA genes. The initiation of transcription in eukaryotes involves the 
binding of several transcription factors to complex promoter sequences that 
are usually located upstream of the gene being copied. The mRNA is 
synthesized in the 5' to 3' direction, and the FACT complex moves and 
reassembles nucleosomes as the polymerase passes by. Whereas RNA 
polymerases I and III terminate transcription by protein- or RNA hairpin- 
dependent methods, RNA polymerase II transcribes for 1,000 or more 
nucleotides beyond the gene template and cleaves the excess during pre- 
mRNA processing. 


Art Connections 


Exercise: 


Problem: 


[link] A scientist splices a eukaryotic promoter in front of a bacterial 
gene and inserts the gene in a bacterial chromosome. Would you 
expect the bacteria to transcribe the gene? 


Solution: 


[link] No. Prokaryotes use different promoters than eukaryotes. 


Review Questions 


Exercise: 


Problem: 


Which feature of promoters can be found in both prokaryotes and 
eukaryotes? 


a. GC box 

b. TATA box 

c. octamer box 

d. -10 and -35 sequences 


Solution: 


B 
Exercise: 


Problem: 
What transcripts will be most affected by low levels of α-amanitin? 


a. 18S and 28S rRNAs 

b. pre-mRNAs 

c. 5S rRNAs and tRNAs 

d. other small nuclear RNAs 


Solution: 


B 


Glossary 


CAAT box 
(GGCCAATCT) essential eukaryotic promoter sequence involved in 
binding transcription factors 


FACT 
complex that “facilitates chromatin transcription” by disassembling 
nucleosomes ahead of a transcribing RNA polymerase II and 
reassembling them after the polymerase passes by 


GC-rich box 
(GGCG) nonessential eukaryotic promoter sequence that binds cellular 
factors to increase the efficiency of transcription; may be present 
several times in a promoter 


Octamer box 


(ATTTGCAT) nonessential eukaryotic promoter sequence that binds 
cellular factors to increase the efficiency of transcription; may be 
present several times in a promoter 


preinitiation complex 
cluster of transcription factors and other proteins that recruit RNA 
polymerase II for transcription of a DNA template 


small nuclear RNA 
molecules synthesized by RNA polymerase III that have a variety of 
functions, including splicing pre-mRNAs and regulating transcription 
factors 


RNA Processing in Eukaryotes 
By the end of this section, you will be able to: 


• Describe the different steps in RNA processing 
• Understand the significance of exons, introns, and splicing 
• Explain how tRNAs and rRNAs are processed 


After transcription, eukaryotic pre-mRNAs must undergo several 
processing steps before they can be translated. Eukaryotic (and prokaryotic) 
tRNAs and rRNAs also undergo processing before they can function as 
components in the protein synthesis machinery. 


mRNA Processing 


The eukaryotic pre-mRNA undergoes extensive processing before it is 
ready to be translated. The additional steps involved in eukaryotic mRNA 
maturation create a molecule with a much longer half-life than a 
prokaryotic mRNA. Eukaryotic mRNAs last for several hours, whereas the 
typical E. coli mRNA lasts no more than five seconds. 


Pre-mRNAs are first coated in RNA-stabilizing proteins; these protect the 
pre-mRNA from degradation while it is processed and exported out of the 
nucleus. The three most important steps of pre-mRNA processing are the 
addition of stabilizing and signaling factors at the 5' and 3' ends of the 
molecule, and the removal of intervening sequences that do not specify the 
appropriate amino acids. In rare cases, the mRNA transcript can be “edited” 
after it is transcribed. 


Note: 

Evolution Connection 

RNA Editing in Trypanosomes 

The trypanosomes are a group of protozoa that include the pathogen 
Trypanosoma brucei, which causes sleeping sickness in humans ([link]). 
Trypanosomes, and virtually all other eukaryotes, have organelles called 
mitochondria that supply the cell with chemical energy. Mitochondria are 


organelles that express their own DNA and are believed to be the remnants 
of a symbiotic relationship between a eukaryote and an engulfed 
prokaryote. The mitochondrial DNA of trypanosomes exhibits an 
interesting exception to the central dogma: their pre-mRNAs do not have 
the correct information to specify a functional protein. Usually, this is 
because the mRNA is missing several U nucleotides. The cell performs an 
additional RNA processing step called RNA editing to remedy this. 


Trypanosoma brucei is 
the causative agent of 
sleeping sickness in 
humans. The mRNAs of 
this pathogen must be 
modified by the addition 
of nucleotides before 
protein synthesis can 
occur. (credit: 
modification of work by 
Torsten Ochsenreiter) 


Other genes in the mitochondrial genome encode 40- to 80-nucleotide 
guide RNAs. One or more of these molecules interacts by complementary 
base pairing with some of the nucleotides in the pre-mRNA transcript. 
However, the guide RNA has more A nucleotides than the pre-mRNA has 
U nucleotides to bind with. In these regions, the guide RNA loops out. The 


3' ends of guide RNAs have a long poly-U tail, and these U bases are 
inserted in regions of the pre-mRNA transcript at which the guide RNAs 
are looped. This process is entirely mediated by RNA molecules. That is, 
guide RNAs—rather than proteins—serve as the catalysts in RNA editing. 
RNA editing is not just a phenomenon of trypanosomes. In the 
mitochondria of some plants, almost all pre-mRNAs are edited. RNA 
editing has also been identified in mammals such as rats, rabbits, and even 
humans. What could be the evolutionary reason for this additional step in 
pre-mRNA processing? One possibility is that the mitochondria, being 
remnants of ancient prokaryotes, have an equally ancient RNA-based 
method for regulating gene expression. In support of this hypothesis, edits 
made to pre-mRNAs differ depending on cellular conditions. Although 
speculative, the process of RNA editing may be a holdover from a 
primordial time when RNA molecules, instead of proteins, were 
responsible for catalyzing reactions. 


5' Capping 


While the pre-mRNA is still being synthesized, a 7-methylguanosine cap 
is added to the 5' end of the growing transcript by a phosphate linkage. This 
moiety (functional group) protects the nascent mRNA from degradation. In 
addition, factors involved in protein synthesis recognize the cap to help 
initiate translation by ribosomes. 


3' Poly-A Tail 


Once elongation is complete, the pre-mRNA is cleaved by an endonuclease 
between an AAUAAA consensus sequence and a GU-rich sequence, 
leaving the AAUAAA sequence on the pre-mRNA. An enzyme called poly- 
A polymerase then adds a string of approximately 200 A residues, called 
the poly-A tail. This modification further protects the pre-mRNA from 
degradation and signals to cellular factors that the transcript needs to be 
exported to the cytoplasm. 


Pre-mRNA Splicing 


Eukaryotic genes are composed of exons, which correspond to protein- 
coding sequences (ex-on signifies that they are expressed), and intervening 
sequences called introns (int-ron denotes their intervening role), which may 
be involved in gene regulation but are removed from the pre-mRNA during 
processing. Intron sequences in mRNA do not encode functional proteins. 


The discovery of introns came as a surprise to researchers in the 1970s who 
expected that pre-mRNAs would specify protein sequences without further 
processing, as they had observed in prokaryotes. The genes of higher 
eukaryotes very often contain one or more introns. These regions may 
correspond to regulatory sequences; however, the biological significance of 
having many introns or having very long introns in a gene is unclear. It is 
possible that introns slow down gene expression because it takes longer to 
transcribe pre-mRNAs with lots of introns. Alternatively, introns may be 
nonfunctional sequence remnants left over from the fusion of ancient genes 
throughout evolution. This is supported by the fact that separate exons often 
encode separate protein subunits or domains. For the most part, the 
sequences of introns can be mutated without ultimately affecting the protein 
product. 


All of a pre-mRNA’s introns must be completely and precisely removed 
before protein synthesis. If the process errs by even a single nucleotide, the 
reading frame of the rejoined exons would shift, and the resulting protein 
would be dysfunctional. The process of removing introns and reconnecting 
exons is called splicing ([link]). Introns are removed and degraded while 
the pre-mRNA is still in the nucleus. Splicing occurs by a sequence-specific 
mechanism that ensures introns will be removed and exons rejoined with 
the accuracy and precision of a single nucleotide. The splicing of pre- 
mRNAs is conducted by complexes of proteins and RNA molecules called 
spliceosomes. 
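The reading-frame consequence described above can be illustrated with a minimal Python sketch. The sequences, the intron coordinates, and the helper names `splice` and `codons` are hypothetical and chosen only for illustration; in the cell, the spliceosome recognizes splice sites through sequence signals rather than supplied coordinates.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) index pairs (end exclusive) and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


def codons(mrna):
    """Split an mRNA into consecutive three-nucleotide codons, dropping any incomplete tail."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - len(mrna) % 3, 3)]


# Hypothetical pre-mRNA: exon 1, one intron, exon 2 (ending in the stop codon UAA).
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUCAG", "GGUUAUUAA"
pre_mrna = exon1 + intron + exon2
intron_start = len(exon1)
intron_end = intron_start + len(intron)

# Precise splicing removes exactly the intron.
precise = splice(pre_mrna, [(intron_start, intron_end)])
# Splicing error: the 3' cut falls one nucleotide too late, removing one exon base.
shifted = splice(pre_mrna, [(intron_start, intron_end + 1)])

print(codons(precise))  # ['AUG', 'GCC', 'GGU', 'UAU', 'UAA']
print(codons(shifted))  # ['AUG', 'GCC', 'GUU', 'AUU']
```

In this toy case, a one-nucleotide error changes every codon downstream of the splice junction and removes the stop codon from the reading frame.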


Note: 
Art Connection 


[Figure labels: snRNPs, intron, spliceosome] 


Pre-mRNA splicing involves 
the precise removal of introns 
from the primary RNA 
transcript. The splicing 
process is catalyzed by 
protein complexes called 
spliceosomes that are 
composed of proteins and 
RNA molecules called 
snRNAs. Spliceosomes 
recognize sequences at the 5' 
and 3' end of the intron. 


Errors in splicing are implicated in cancers and other human diseases. 
What kinds of mutations might lead to splicing errors? Think of different 
possible outcomes if splicing errors occur. 


Note that more than 70 individual introns can be present, and each has to 
undergo the process of splicing—in addition to 5' capping and the addition 


of a poly-A tail—just to generate a single, translatable mRNA molecule. 


Note: 
Link to Learning 
See how introns are removed during RNA splicing at this website. 


Processing of tRNAs and rRNAs 


The tRNAs and rRNAs are structural molecules that have roles in protein 
synthesis; however, these RNAs are not themselves translated. Pre-rRNAs 
are transcribed, processed, and assembled into ribosomes in the nucleolus. 
Pre-tRNAs are transcribed and processed in the nucleus and then released 
into the cytoplasm where they are linked to free amino acids for protein 
synthesis. 


Most of the tRNAs and rRNAs in eukaryotes and prokaryotes are first 
transcribed as a long precursor molecule that spans multiple rRNAs or 
tRNAs. Enzymes then cleave the precursors into subunits corresponding to 
each structural RNA. Some of the bases of pre-rRNAs are methylated; that 
is, a -CH3 moiety (methyl functional group) is added for stability. Pre-tRNA 
molecules also undergo methylation. As with pre-mRNAs, subunit 
excision occurs in eukaryotic pre-RNAs destined to become tRNAs or 
rRNAs. 


Mature rRNAs make up approximately 50 percent of each ribosome. Some 
of a ribosome’s RNA molecules are purely structural, whereas others have 
catalytic or binding activities. Mature tRNAs take on a three-dimensional 


structure through intramolecular hydrogen bonding to position the amino 
acid binding site at one end and the anticodon at the other end ([link]). The 
anticodon is a three-nucleotide sequence in a tRNA that interacts with an 
mRNA codon through complementary base pairing. 


Amino acid (phenylalanine) 
attachment site 


Phenylalanine tRNA 


This is a space-filling model of a tRNA molecule that adds the amino acid 
phenylalanine to a growing polypeptide chain. The anticodon AAG binds the 
codon UUC on the mRNA. The amino acid phenylalanine is attached to the 
other end of the tRNA. 


Section Summary 


Eukaryotic pre-mRNAs are modified with a 5' methylguanosine cap and a 
poly-A tail. These structures protect the mature mRNA from degradation 
and help export it from the nucleus. Pre-mRNAs also undergo splicing, in 
which introns are removed and exons are reconnected with single- 
nucleotide accuracy. Only finished mRNAs that have undergone 5' capping, 
3' polyadenylation, and intron splicing are exported from the nucleus to the 
cytoplasm. Pre-rRNAs and pre-tRNAs may be processed by intramolecular 
cleavage, splicing, methylation, and chemical conversion of nucleotides. 
Rarely, RNA editing is also performed to insert missing bases after an 
mRNA has been synthesized. 


Art Connections 


Exercise: 


Problem: 


[link] Errors in splicing are implicated in cancers and other human 
diseases. What kinds of mutations might lead to splicing errors? Think 
of different possible outcomes if splicing errors occur. 


Solution: 


[link] Mutations in the spliceosome recognition sequence at each end 
of the intron, or in the proteins and RNAs that make up the 
spliceosome, may impair splicing. Mutations may also add new 
spliceosome recognition sites. Splicing errors could lead to introns 
being retained in spliced RNA, exons being excised, or changes in the 
location of the splice site. 


Review Questions 


Exercise: 


Problem: 


Which pre-mRNA processing step is important for initiating 
translation? 


a. poly-A tail 

b. RNA editing 

c. splicing 

d. 7-methylguanosine cap 


Solution: 


D 
Exercise: 


Problem: 


What processing step enhances the stability of pre-tRNAs and pre- 
rRNAs? 


a. methylation 

b. nucleotide modification 
c. cleavage 

d. splicing 


Solution: 


A 


Glossary 


7-methylguanosine cap 
modification added to the 5' end of pre-mRNAs to protect mRNA from 
degradation and assist translation 


anticodon 
three-nucleotide sequence in a tRNA molecule that corresponds to an 
mRNA codon 


exon 
sequence present in protein-coding mRNA after completion of pre- 
mRNA splicing 


intron 
non-protein-coding intervening sequences that are spliced from 
mRNA during processing 


poly-A tail 
modification added to the 3' end of pre-mRNAs to protect mRNA from 
degradation and assist mRNA export from the nucleus 


RNA editing 
direct alteration of one or more nucleotides in an mRNA that has 
already been synthesized 


splicing 
process of removing introns and reconnecting exons in a pre-mRNA 


Ribosomes and Protein Synthesis 
By the end of this section, you will be able to: 


• Describe the different steps in protein synthesis 
• Discuss the role of ribosomes in protein synthesis 


The synthesis of proteins consumes more of a cell’s energy than any other 
metabolic process. In turn, proteins account for more mass than any other 
component of living organisms (with the exception of water), and proteins 
perform virtually every function of a cell. The process of translation, or 
protein synthesis, involves the decoding of an mRNA message into a 
polypeptide product. Amino acids are covalently strung together by 
interlinking peptide bonds in lengths ranging from approximately 50 amino 
acid residues to more than 1,000. Each individual amino acid has an amino 
group (NH2) and a carboxyl (COOH) group. Polypeptides are formed when 
the amino group of one amino acid forms an amide (i.e., peptide) bond with 
the carboxyl group of another amino acid ([link]). This reaction is catalyzed 
by ribosomes and generates one water molecule. 


Peptide Bond 


A peptide bond links the 
carboxyl end of one amino 
acid with the amino end of 

another, expelling one water 
molecule. For simplicity in 
this image, only the 
functional groups involved in 
the peptide bond are shown. 
The R and R' designations 


refer to the rest of each amino 
acid structure. 


The Protein Synthesis Machinery 


In addition to the mRNA template, many molecules and macromolecules 
contribute to the process of translation. The composition of each component 
may vary across species; for instance, ribosomes may consist of different 
numbers of rRNAs and polypeptides depending on the organism. However, 
the general structures and functions of the protein synthesis machinery are 
comparable from bacteria to human cells. Translation requires the input of 
an mRNA template, ribosomes, tRNAs, and various enzymatic factors. 


Note: 
Link to Learning 
Click through the steps of this PBS interactive to see protein synthesis in 
action. 


Ribosomes 


Even before an mRNA is translated, a cell must invest energy to build each 
of its ribosomes. In E. coli, there are between 10,000 and 70,000 ribosomes 
present in each cell at any given time. A ribosome is a complex 
macromolecule composed of structural and catalytic rRNAs, and many 
distinct polypeptides. In eukaryotes, the nucleolus is completely specialized 
for the synthesis and assembly of rRNAs. 


Ribosomes exist in the cytoplasm in prokaryotes and in the cytoplasm and 
rough endoplasmic reticulum in eukaryotes. Mitochondria and chloroplasts 
also have their own ribosomes in the matrix and stroma, which look more 
similar to prokaryotic ribosomes (and have similar drug sensitivities) than 
the ribosomes just outside their outer membranes in the cytoplasm. 
Ribosomes dissociate into large and small subunits when they are not 
synthesizing proteins and reassociate during the initiation of translation. In 
E. coli, the small subunit is described as 30S, and the large subunit is 50S, 
for a total of 70S (recall that Svedberg units are not additive). Mammalian 
ribosomes have a small 40S subunit and a large 60S subunit, for a total of 
80S. The small subunit is responsible for binding the mRNA template, 
whereas the large subunit sequentially binds tRNAs. Each mRNA molecule 
is simultaneously translated by many ribosomes, all synthesizing protein in 
the same direction: reading the mRNA from 5' to 3' and synthesizing the 
polypeptide from the N terminus to the C terminus. The complete 
mRNA/poly-ribosome structure is called a polysome. 


tRNAs 


The tRNAs are structural RNA molecules that were transcribed from genes 
by RNA polymerase III. Depending on the species, 40 to 60 types of tRNAs 
exist in the cytoplasm. Serving as adaptors, specific tRNAs bind to 
sequences on the mRNA template and add the corresponding amino acid to 
the polypeptide chain. Therefore, tRNAs are the molecules that actually 
“translate” the language of RNA into the language of proteins. 


Of the 64 possible mRNA codons—or triplet combinations of A, U, G, and 
C—three specify the termination of protein synthesis and 61 specify the 
addition of amino acids to the polypeptide chain. Of these 61, one codon 
(AUG) also encodes the initiation of translation. Each tRNA anticodon can 
base pair with one of the mRNA codons and add an amino acid or terminate 
translation, according to the genetic code. For instance, if the sequence 
CUA occurred on an mRNA template in the proper reading frame, it would 
bind a tRNA expressing the complementary sequence, GAU, which would 
be linked to the amino acid leucine. 
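The codon counts and the CUA example above can be checked with a short Python sketch. The function name `paired_anticodon` is hypothetical, and the sketch assumes strict one-to-one base pairing at all three positions. The anticodon is written in the order in which it lines up against the codon, which is how the text presents GAU opposite CUA.

```python
from itertools import product

BASES = "AUGC"
STOP_CODONS = {"UAA", "UAG", "UGA"}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Every triplet combination of the four RNA bases.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
# Codons that specify an amino acid rather than termination.
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]


def paired_anticodon(codon):
    """Return the bases that pair with the codon, position by position.

    The codon is read 5' to 3', so the returned anticodon is written 3' to 5'.
    """
    return "".join(COMPLEMENT[base] for base in codon)


print(len(all_codons), len(sense_codons))  # 64 61
print(paired_anticodon("CUA"))             # GAU
```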


As the adaptor molecules of translation, it is surprising that tRNAs can fit 
so much specificity into such a small package. Consider that tRNAs need to 
interact with three factors: 1) they must be recognized by the correct 
aminoacyl tRNA synthetase (see below); 2) they must be recognized by 
ribosomes; and 3) they must bind to the correct sequence in mRNA. 


Aminoacyl tRNA Synthetases 


The process of pre-tRNA synthesis by RNA polymerase III only creates the 
RNA portion of the adaptor molecule. The corresponding amino acid must 
be added later, once the tRNA is processed and exported to the cytoplasm. 
Through the process of tRNA “charging,” each tRNA molecule is linked to 
its correct amino acid by a group of enzymes called aminoacyl tRNA 
synthetases. At least one type of aminoacyl tRNA synthetase exists for 
each of the 20 amino acids; the exact number of aminoacyl tRNA 
synthetases varies by species. These enzymes first bind and hydrolyze ATP 
to catalyze a high-energy bond between an amino acid and adenosine 
monophosphate (AMP); a pyrophosphate molecule is expelled in this 
reaction. The activated amino acid is then transferred to the tRNA, and 
AMP is released. 
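The two-step charging reaction described above can be summarized schematically as follows. This is a simplified restatement of the paragraph, not a full mechanism; PP_i denotes pyrophosphate, and "aminoacyl-AMP" is used here as a label for the amino acid joined to AMP by the high-energy bond.

```latex
\begin{align*}
\text{amino acid} + \text{ATP} &\longrightarrow \text{aminoacyl-AMP} + \text{PP}_i \\
\text{aminoacyl-AMP} + \text{tRNA} &\longrightarrow \text{aminoacyl-tRNA} + \text{AMP}
\end{align*}
```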


The Mechanism of Protein Synthesis 


As with mRNA synthesis, protein synthesis can be divided into three 
phases: initiation, elongation, and termination. The process of translation is 
similar in prokaryotes and eukaryotes. Here we’ll explore how translation 
occurs in E. coli, a representative prokaryote, and specify any differences 
between prokaryotic and eukaryotic translation. 


Initiation of Translation 


Protein synthesis begins with the formation of an initiation complex. In E. 
coli, this complex involves the small 30S ribosome, the mRNA template, 
three initiation factors (IFs; IF-1, IF-2, and IF-3), and a special initiator 
tRNA, called tRNAfMet. The initiator tRNA interacts with the start codon 
AUG (or rarely, GUG), links to a formylated methionine called fMet, and 
can also bind IF-2. Formylated methionine is inserted by fMet-tRNAfMet 
at the beginning of every polypeptide chain synthesized by E. coli, but it is 
usually clipped off after translation is complete. When an in-frame AUG is 
encountered during translation elongation, a non-formylated methionine is 
inserted by a regular Met-tRNAMet. 


In E. coli mRNA, a sequence upstream of the first AUG codon, called the 
Shine-Dalgarno sequence (AGGAGG), interacts with the rRNA molecules 
that compose the ribosome. This interaction anchors the 30S ribosomal 
subunit at the correct location on the mRNA template. Guanosine 
triphosphate (GTP), which is a purine nucleotide triphosphate, acts as an 
energy source during translation—both at the start of elongation and during 
the ribosome’s translocation. 
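As a rough illustration of how the Shine-Dalgarno sequence marks the correct start codon, the following Python sketch searches an mRNA for AGGAGG and then for the first AUG downstream of it. The transcript and the function name `find_start_after_sd` are hypothetical, and an exact string match is a simplifying assumption; the ribosome positions itself through base pairing with rRNA, not by text search.

```python
SHINE_DALGARNO = "AGGAGG"


def find_start_after_sd(mrna):
    """Return the index of the first AUG downstream of an exact AGGAGG match, or None."""
    sd_index = mrna.find(SHINE_DALGARNO)
    if sd_index == -1:
        return None
    start_index = mrna.find("AUG", sd_index + len(SHINE_DALGARNO))
    return start_index if start_index != -1 else None


# Hypothetical transcript: leader, Shine-Dalgarno sequence, short spacer, then the start codon.
mrna = "GGCUAGGAGGUAACAUAUGGCCGGUUAUUAA"
start = find_start_after_sd(mrna)
print(start, mrna[start:start + 3])  # 16 AUG
```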


In eukaryotes, a similar initiation complex forms, comprising mRNA, the 
40S small ribosomal subunit, IFs, and nucleoside triphosphates (GTP and 
ATP). The charged initiator tRNA, called Met-tRNAi, does not bind fMet in 
eukaryotes, but is distinct from other Met-tRNAs in that it can bind IFs. 


Instead of depositing at the Shine-Dalgarno sequence, the eukaryotic 
initiation complex recognizes the 7-methylguanosine cap at the 5' end of the 
mRNA. A cap-binding protein (CBP) and several other IFs assist the 
movement of the ribosome to the 5' cap. Once at the cap, the initiation 
complex tracks along the mRNA in the 5' to 3' direction, searching for the 
AUG start codon. Many eukaryotic mRNAs are translated from the first 
AUG, but this is not always the case. According to Kozak’s rules, the 
nucleotides around the AUG indicate whether it is the correct start codon. 
Kozak’s rules state that the following consensus sequence must appear 
around the AUG of vertebrate genes: 5'-gccRccAUGG-3'. The R (for 
purine) indicates a site that can be either A or G, but cannot be C or U. 
Essentially, the closer the sequence is to this consensus, the higher the 
efficiency of translation. 
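The idea that a closer match to the consensus means more efficient translation can be sketched in Python by counting agreeing positions. The function name `kozak_matches` and the two test contexts are hypothetical, and weighting every position equally is an assumption made for simplicity rather than a measure of real translation efficiency.

```python
KOZAK_CONSENSUS = "GCCRCCAUGG"  # R stands for a purine (A or G)
PURINES = {"A", "G"}


def kozak_matches(context):
    """Count positions in a 10-nucleotide context that agree with the consensus."""
    if len(context) != len(KOZAK_CONSENSUS):
        raise ValueError("context must be 10 nucleotides long")
    matches = 0
    for observed, expected in zip(context.upper(), KOZAK_CONSENSUS):
        if expected == "R":
            matches += observed in PURINES
        else:
            matches += observed == expected
    return matches


print(kozak_matches("GCCACCAUGG"))  # 10
print(kozak_matches("UUUCUUAUGU"))  # 3
```

The first context matches at every position, whereas the second agrees only at the AUG itself.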


Once the appropriate AUG is identified, the other proteins and CBP 
dissociate, and the 60S subunit binds to the complex of Met-tRNAi, mRNA, 
and the 40S subunit. This step completes the initiation of translation in 
eukaryotes. 


Translation, Elongation, and Termination 


In prokaryotes and eukaryotes, the basics of elongation are the same, so we 
will review elongation from the perspective of E. coli. The 50S ribosomal 
subunit of E. coli consists of three compartments. The A (aminoacyl) site 
binds incoming charged aminoacyl tRNAs. The P (peptidyl) site binds 
charged tRNAs carrying amino acids that have formed peptide bonds with 
the growing polypeptide chain but have not yet dissociated from their 
corresponding tRNA. The E (exit) site releases dissociated tRNAs so that 
they can be recharged with free amino acids. There is one exception to this 
assembly line of tRNAs: in E. coli, fMet-tRNAfMet is capable of 
entering the P site directly without first entering the A site. Similarly, the 
eukaryotic Met-tRNAi, with help from other proteins of the initiation 
complex, binds directly to the P site. In both cases, this creates an initiation 
complex with a free A site ready to accept the tRNA corresponding to the 
first codon after the AUG. 


During translation elongation, the mRNA template provides specificity. As 
the ribosome moves along the mRNA, each mRNA codon comes into 
register, and specific binding with the corresponding charged tRNA 
anticodon is ensured. If mRNA were not present in the elongation complex, 
the ribosome would bind tRNAs nonspecifically. 


Elongation proceeds with charged tRNAs entering the A site and then 
shifting to the P site followed by the E site with each single-codon “step” of 
the ribosome. Ribosomal steps are induced by conformational changes that 
advance the ribosome by three bases in the 3' direction. The energy for each 
step of the ribosome is donated by an elongation factor that hydrolyzes 
GTP. Peptide bonds form between the amino group of the amino acid 
attached to the A-site tRNA and the carboxyl group of the amino acid 
attached to the P-site tRNA. The formation of each peptide bond is 
catalyzed by peptidyl transferase, an RNA-based enzyme that is integrated 
into the 50S ribosomal subunit. The energy for each peptide bond formation 


is derived from GTP hydrolysis, which is catalyzed by a separate elongation 
factor. The amino acid bound to the P-site tRNA is also linked to the 
growing polypeptide chain. As the ribosome steps across the mRNA, the 
former P-site tRNA enters the E site, detaches from the amino acid, and is 
expelled ([link]). Amazingly, the E. coli translation apparatus takes only 
0.05 seconds to add each amino acid, meaning that a 200-amino acid 
protein can be translated in just 10 seconds. 
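The arithmetic behind this estimate is a single multiplication, shown here as a small Python sketch. The function name `translation_time` is hypothetical, and the calculation assumes a constant rate with no time spent on initiation or termination.

```python
SECONDS_PER_AMINO_ACID = 0.05  # rate quoted in the text for E. coli


def translation_time(amino_acids):
    """Estimate elongation time in seconds for a polypeptide of the given length."""
    return amino_acids * SECONDS_PER_AMINO_ACID


print(f"{translation_time(200):.1f} s")   # 10.0 s
print(f"{translation_time(1000):.1f} s")  # 50.0 s
```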


Note: 
Art Connection 


[Figure labels: large ribosomal subunit, small ribosomal subunit, polypeptide chain] 


Translation begins when 
an initiator tRNA 
anticodon recognizes a 
codon on mRNA. The 
large ribosomal subunit 


joins the small subunit, 
and a second tRNA is 
recruited. As the mRNA 
moves relative to the 
ribosome, the polypeptide 
chain is formed. Entry of 
a release factor into the A 
site terminates translation 
and the components 
dissociate. 


Many antibiotics inhibit bacterial protein synthesis. For example, 
tetracycline blocks the A site on the bacterial ribosome, and 
chloramphenicol blocks peptidyl transfer. What specific effect would you 
expect each of these antibiotics to have on protein synthesis? 
Tetracycline would directly affect: 


a. tRNA binding to the ribosome 
b. ribosome assembly 
c. growth of the protein chain 


Chloramphenicol would directly affect: 


a. tRNA binding to the ribosome 
b. ribosome assembly 
c. growth of the protein chain 


Termination of translation occurs when a nonsense codon (UAA, UAG, or 
UGA) is encountered. Upon aligning with the A site, these nonsense codons 
are recognized by release factors in prokaryotes and eukaryotes that instruct 
peptidyl transferase to add a water molecule to the carboxyl end of the P-site 
amino acid. This reaction forces the P-site amino acid to detach from its 
tRNA, and the newly made protein is released. The small and large 
ribosomal subunits dissociate from the mRNA and from each other; they 


are recruited almost immediately into another translation initiation complex. 
After many ribosomes have completed translation, the mRNA is degraded 
so the nucleotides can be reused in another transcription reaction. 


Protein Folding, Modification, and Targeting 


During and after translation, individual amino acids may be chemically 
modified, signal sequences may be appended, and the new protein “folds” 
into a distinct three-dimensional structure as a result of intramolecular 
interactions. A signal sequence is a short tail of amino acids that directs a 
protein to a specific cellular compartment. These sequences at the amino 
end or the carboxyl end of the protein can be thought of as the protein’s 
“train ticket” to its ultimate destination. Other cellular factors recognize 
each signal sequence and help transport the protein from the cytoplasm to 
its correct compartment. For instance, a specific sequence at the amino 
terminus will direct a protein to the mitochondria or chloroplasts (in plants). 
Once the protein reaches its cellular destination, the signal sequence is 
usually clipped off. 


Many proteins fold spontaneously, but some proteins require helper 
molecules, called chaperones, to prevent them from aggregating during the 
complicated process of folding. Even if a protein is properly specified by its 
corresponding mRNA, it could take on a completely dysfunctional shape if 
abnormal temperature or pH conditions prevent it from folding correctly. 


Section Summary 


The players in translation include the mRNA template, ribosomes, tRNAs, 
and various enzymatic factors. The small ribosomal subunit forms on the 
mRNA template either at the Shine-Dalgarno sequence (prokaryotes) or the 
5' cap (eukaryotes). Translation begins at the initiating AUG on the mRNA, 
specifying methionine. The formation of peptide bonds occurs between 
sequential amino acids specified by the mRNA template according to the 
genetic code. Charged tRNAs enter the ribosomal A site, and their amino 
acid bonds with the amino acid at the P site. The entire mRNA is translated 
in three-nucleotide “steps” of the ribosome. When a nonsense codon is 
encountered, a release factor binds and dissociates the components and 


frees the new protein. Folding of the protein occurs during and after 
translation. 


Art Connections 


Exercise: 


Problem: 
[link] Many antibiotics inhibit bacterial protein synthesis. For 
example, tetracycline blocks the A site on the bacterial ribosome, and 


chloramphenicol blocks peptidyl transfer. What specific effect would 
you expect each of these antibiotics to have on protein synthesis? 


Tetracycline would directly affect: 


a. tRNA binding to the ribosome 
b. ribosome assembly 
c. growth of the protein chain 


Chloramphenicol would directly affect: 


a. tRNA binding to the ribosome 
b. ribosome assembly 
c. growth of the protein chain 


Solution: 


[link] Tetracycline: a; Chloramphenicol: c. 


Review Questions 


Exercise: 


Problem: 


The RNA components of ribosomes are synthesized in the 


a. cytoplasm 

b. nucleus 

c. nucleolus 

d. endoplasmic reticulum 


Solution: 


C 
Exercise: 


Problem: 


In any given species, there are at least how many types of aminoacyl 
tRNA synthetases? 


a. 20 
b. 40 
c. 100 
d. 200 


Solution: 


A 


Free Response 


Exercise: 


Problem: 


Transcribe and translate the following DNA sequence (nontemplate 
strand): 5'-ATGGCCGGTTATTAAGCA-3' 


Solution: 


The mRNA would be: 5'-AUGGCCGGUUAUUAAGCA-3'. The 
protein would be: MAGY. Even though there are six codons, the fifth 
codon corresponds to a stop, so the sixth codon would not be 
translated. 
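The solution above can be reproduced with a minimal Python sketch. The function names `transcribe` and `translate` are hypothetical; the codon table is the standard genetic code written in the conventional U, C, A, G order, with `*` marking stop codons. The sketch assumes translation starts at the first AUG and ignores all initiation requirements discussed in this section.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(nontemplate_dna):
    """The mRNA matches the nontemplate strand, with U in place of T."""
    return nontemplate_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCGGTTATTAAGCA")
print(mrna)             # AUGGCCGGUUAUUAAGCA
print(translate(mrna))  # MAGY
```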


Exercise: 


Problem: 


Explain how single nucleotide changes can have vastly different 
effects on protein function. 


Solution: 


Nucleotide changes in the third position of codons may not change the 
amino acid and would have no effect on the protein. Other nucleotide 
changes that change important amino acids or create or delete start or 
stop codons would have severe effects on the amino acid sequence of 
the protein. 


Glossary 


aminoacyl tRNA synthetase 
enzyme that “charges” tRNA molecules by catalyzing a bond between 
the tRNA and a corresponding amino acid 


initiator tRNA 
in prokaryotes, called tRNAfMet; in eukaryotes, called tRNAi; a 
tRNA that interacts with a start codon, binds directly to the ribosome P 
site, and links to a special methionine to begin a polypeptide chain 


Kozak’s rules 
determines the correct initiation AUG in a eukaryotic mRNA; the 
following consensus sequence must appear around the AUG: 5’- 
GCC(purine)CCAUGG-3’; the bolded bases are most important 


peptidyl transferase 


RNA-based enzyme that is integrated into the 50S ribosomal subunit 
and catalyzes the formation of peptide bonds 


polysome 
mRNA molecule simultaneously being translated by many ribosomes 
all going in the same direction 


Shine-Dalgarno sequence 
(AGGAGG); initiates prokaryotic translation by interacting with rRNA 
molecules comprising the 30S ribosome 


signal sequence 
short tail of amino acids that directs a protein to a specific cellular 
compartment 


start codon 
AUG (or rarely, GUG) on an mRNA from which translation begins; 
always specifies methionine 


Introduction 


Living things may be single-celled or complex, multicellular organisms. 
They may be plants, animals, fungi, bacteria, or archaea. This diversity 
results from evolution. (credit ... modification ... modification ... Cory ...) 
All life on Earth is related. Evolutionary theory states that humans, beetles, 
plants, and bacteria all share a common ancestor, but that millions of years 
of evolution have shaped each of these organisms into the forms seen today. 
Scientists consider evolution a key concept to understanding life. Natural 
selection is one of the most dominant evolutionary forces. Natural selection 
acts to promote traits and behaviors that increase an organism’s chances of 
survival and reproduction, while eliminating those traits and behaviors that 
are to the organism’s detriment. But natural selection can only, as its name 
implies, select—it cannot create. The introduction of novel traits and 
behaviors falls on the shoulders of another evolutionary force—mutation. 
Mutation and other sources of variation among individuals, as well as the 
evolutionary forces that act upon them, alter populations and species. This 
combination of processes has led to the world of life we see today. 


Population Evolution 
By the end of this section, you will be able to: 


• Define population genetics and describe how population genetics is 
used in the study of the evolution of populations 
• Define the Hardy-Weinberg principle and discuss its importance 


The mechanisms of inheritance, or genetics, were not understood at the time 
Charles Darwin and Alfred Russel Wallace were developing their idea of 
natural selection. This lack of understanding was a stumbling block to 
understanding many aspects of evolution. In fact, the predominant (and 
incorrect) genetic theory of the time, blending inheritance, made it difficult 
to understand how natural selection might operate. Darwin and Wallace 
were unaware of the genetics work by Austrian monk Gregor Mendel, 
which was published in 1866, not long after publication of Darwin's book, 
On the Origin of Species. Mendel’s work was rediscovered in the early 
twentieth century, at which time geneticists were rapidly coming to an 
understanding of the basics of inheritance. Initially, the newly discovered 
particulate nature of genes made it difficult for biologists to understand how 
gradual evolution could occur. But over the next few decades genetics and 
evolution were integrated in what became known as the modern synthesis 
—the coherent understanding of the relationship between natural selection 
and genetics that took shape by the 1940s and is generally accepted today. 
In sum, the modern synthesis describes how evolutionary processes, such as 
natural selection, can affect a population’s genetic makeup, and, in turn, 
how this can result in the gradual evolution of populations and species. The 
theory also connects this change of a population over time, called 
microevolution, with the processes that gave rise to new species and higher 
taxonomic groups with widely divergent characters, called 
macroevolution. 


Note: 

Everyday Connection 

Evolution and Flu Vaccines 

Every fall, the media starts reporting on flu vaccinations and potential 
outbreaks. Scientists, health experts, and institutions determine 
recommendations for different parts of the population, predict optimal 
production and inoculation schedules, create vaccines, and set up clinics to 
provide inoculations. You may think of the annual flu shot as a lot of media 
hype, an important health protection, or just a briefly uncomfortable prick 
in your arm. But do you think of it in terms of evolution? 

The media hype of annual flu shots is scientifically grounded in our 
understanding of evolution. Each year, scientists across the globe strive to 
predict the flu strains that they anticipate being most widespread and 
harmful in the coming year. This knowledge is based in how flu strains 
have evolved over time and over the past few flu seasons. Scientists then 
work to create the most effective vaccine to combat those selected strains. 
Hundreds of millions of doses are produced in a short period in order to 
provide vaccinations to key populations at the optimal time. 

Because viruses, like the flu, evolve very quickly (especially in 
evolutionary time), this poses quite a challenge. Viruses mutate and 
replicate at a fast rate, so the vaccine developed to protect against last 
year’s flu strain may not provide the protection needed against the coming 
year’s strain. Evolution of these viruses means continued adaptations to 
ensure survival, including adaptations to survive previous vaccines. 


Population Genetics 


Recall that a gene for a particular character may have several alleles, or 
variants, that code for different traits associated with that character. For 
example, in the ABO blood type system in humans, three alleles determine 
the particular blood-type carbohydrate on the surface of red blood cells. 
Each individual in a population of diploid organisms can only carry two 
alleles for a particular gene, but more than two may be present in the 
individuals that make up the population. Mendel followed alleles as they 
were inherited from parent to offspring. In the early twentieth century, 
biologists in a field of study known as population genetics began to study 
how selective forces change a population through changes in allele and 
genotypic frequencies. 


The allele frequency (or gene frequency) is the rate at which a specific 
allele appears within a population. Until now we have discussed evolution 
as a change in the characteristics of a population of organisms, but behind 
that phenotypic change is genetic change. In population genetics, the term 
evolution is defined as a change in the frequency of an allele in a 
population. Using the ABO blood type system as an example, the frequency 
of one of the alleles, I^A, is the number of copies of that allele divided by all 
the copies of the ABO gene in the population. For example, a study in 
Jordan[footnote] found the frequency of I^A to be 26.1 percent. The I^B and I^0 
alleles made up 13.4 percent and 60.5 percent of the alleles, respectively, 
and all of the frequencies added up to 100 percent. A change in this 
frequency over time would constitute evolution in the population. 

Sahar S. Hanania, Dhia S. Hassawi, and Nidal M. Irshaid, “Allele 
Frequency and Molecular Genotypes of ABO Blood Group System in a 
Jordanian Population,” Journal of Medical Sciences 7 (2007): 51-58, 
doi: 10.3923/jms.2007.51.58. 
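The allele frequency calculation described above can be sketched in a few lines of Python. The function name and the raw counts are assumptions made for illustration: the study reports percentages only, so the counts below are hypothetical numbers per 1,000 gene copies chosen to match them.

```python
def allele_frequencies(allele_counts):
    """Return each allele's share of all copies of the gene in the population."""
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("population has no gene copies")
    return {allele: count / total for allele, count in allele_counts.items()}


# Hypothetical counts per 1,000 gene copies (not taken from the study itself).
counts = {"I^A": 261, "I^B": 134, "I^0": 605}
freqs = allele_frequencies(counts)

for allele, freq in freqs.items():
    print(f"{allele}: {freq:.1%}")

# The frequencies of all alleles at one locus always add up to 1 (100 percent).
print("sum of frequencies:", round(sum(freqs.values()), 6))
```

Running the same function on counts taken in a later generation and comparing the results is one simple way to see whether the frequencies have changed over time.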


The allele frequency within a given population can change depending on 
environmental factors; therefore, certain alleles become more widespread 
than others during the process of natural selection. Natural selection can 
alter the population’s genetic makeup: for example, a given allele may confer 
a phenotype that allows an individual to better survive or have more 
offspring. Because many of those offspring will also carry the beneficial 
allele, and often the corresponding phenotype, they will have more 
offspring of their own that also carry the allele, thus perpetuating the cycle. 
Over time, the allele will spread throughout the population. Some alleles 
will quickly become fixed in this way, meaning that every individual of the 
population will carry the allele, while detrimental mutations may be swiftly 
eliminated from the gene pool if they are carried on a dominant allele. The 
gene pool is the sum of all the alleles in a population. 


Sometimes, allele frequencies within a population change randomly with no 
advantage to the population over existing allele frequencies. This 
phenomenon is called genetic drift. Natural selection and genetic drift 
usually occur simultaneously in populations and are not isolated events. It is 
hard to determine which process dominates because it is often nearly 
impossible to determine the cause of change in allele frequencies at each 
occurrence. An event that initiates an allele frequency change in an isolated 
part of the population, which is not typical of the original population, is 
called the founder effect. Natural selection, random drift, and founder 
effects can lead to significant changes in the genome of a population. 


Hardy-Weinberg Principle of Equilibrium 


In the early twentieth century, English mathematician Godfrey Hardy and 
German physician Wilhelm Weinberg stated the principle of equilibrium to 
describe the genetic makeup of a population. The theory, which later 
became known as the Hardy-Weinberg principle of equilibrium, states that a 
population’s allele and genotype frequencies are inherently stable: unless 
some kind of evolutionary force is acting upon the population, neither the 
allele nor the genotypic frequencies would change. The Hardy-Weinberg 
principle assumes conditions with no mutations, immigration, emigration, or 
selective pressure for or against any genotype, plus an infinite population; while 
no population can satisfy those conditions, the principle offers a useful 
model against which to compare real population changes. 


Working under this theory, population geneticists represent different alleles 
as different variables in their mathematical models. The variable p, for 
example, often represents the frequency of a particular allele, say Y for the 
trait of yellow in Mendel’s peas, while the variable q represents the 
frequency of y alleles that confer the color green. If these are the only two 
possible alleles for a given locus in the population, p + q = 1. In other 
words, all the p alleles and all the q alleles make up all of the alleles for that 
locus that are found in the population. 


But what ultimately interests most biologists is not the frequencies of 
different alleles, but the frequencies of the resulting genotypes, known as 
the population’s genetic structure, from which scientists can surmise the 
distribution of phenotypes. If only the phenotype is observed, only the genotype 
of homozygous recessive individuals can be known; the calculations provide 
an estimate of the remaining genotypes. Since each individual carries two 
alleles per gene, if the allele frequencies (p and q) are known, predicting the 
frequencies of these genotypes is a simple mathematical calculation to 
determine the probability of getting these genotypes if two alleles are drawn 
at random from the gene pool. So in the above scenario, an individual pea 
plant could be pp (YY), and thus produce yellow peas; pq (Yy), also 
yellow; or qq (yy), and thus produce green peas ([link]). In other words, 
the frequency of pp individuals is simply p^2; the frequency of pq 
individuals is 2pq; and the frequency of qq individuals is q^2. And, again, if 
p and q are the only two possible alleles for a given trait in the population, 
these genotype frequencies will sum to one: p^2 + 2pq + q^2 = 1. 
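As a minimal sketch of this calculation, the Python below turns an allele frequency into expected genotype frequencies and counts. The function name is an assumption; the example values (p = 0.7, from 700 Y alleles among 1,000, in a population of 500) are the ones that appear in the figure for this section.

```python
def genotype_frequencies(p):
    """Expected Hardy-Weinberg genotype frequencies for a locus with two alleles."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p  # with only two alleles, p + q = 1
    return {"YY": p * p, "Yy": 2 * p * q, "yy": q * q}


population_size = 500
p = 0.7  # frequency of the Y allele

frequencies = genotype_frequencies(p)
expected_counts = {g: f * population_size for g, f in frequencies.items()}

for genotype in ("YY", "Yy", "yy"):
    print(f"{genotype}: frequency {frequencies[genotype]:.2f}, "
          f"about {expected_counts[genotype]:.0f} individuals")
```

The three frequencies come from the same two numbers, p and q, so they necessarily add up to one.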


Note: 
Art Connection 


[Figure: Hardy-Weinberg example. Parent generation of 500 individuals; the gene 
pool of 1,000 alleles contains Y: 490 + 210 = 700 and y: 210 + 90 = 300; the 
offspring generation is shown as YY, Yy, and yy genotypes.] 


When populations are in the Hardy-Weinberg equilibrium, the allelic frequency 
is stable from generation to generation and the distribution of alleles can be 
determined from the Hardy-Weinberg equation. If the allelic frequency 
measured in the field differs from the predicted value, scientists can make 
inferences about what evolutionary forces are at play. 


In plants, violet flower color (V) is dominant over white (v). If p = 0.8 and 
q = 0.2 in a population of 500 plants, how many individuals would you 
expect to be homozygous dominant (VV), heterozygous (Vv), and 
homozygous recessive (vv)? How many plants would you expect to have 
violet flowers, and how many would have white flowers? 


In theory, if a population is at equilibrium—that is, there are no 
evolutionary forces acting upon it—generation after generation would have 
the same gene pool and genetic structure, and these equations would all 
hold true all of the time. Of course, even Hardy and Weinberg recognized 
that no natural population is immune to evolution. Populations in nature are 
constantly changing in genetic makeup due to drift, mutation, possibly 
migration, and selection. As a result, the only way to determine the exact 
distribution of phenotypes in a population is to go out and count them. But 
the Hardy-Weinberg principle gives scientists a mathematical baseline of a 
non-evolving population to which they can compare evolving populations 
and thereby infer what evolutionary forces might be at play. If the 
frequencies of alleles or genotypes deviate from the value expected from 
the Hardy-Weinberg equation, then the population is evolving. 
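One way to make this comparison concrete is sketched below in Python. The chi-square statistic is a common choice for measuring how far observed counts fall from expected counts, but it is an assumption here, not a method the text prescribes; the function name and both sets of genotype counts are hypothetical.

```python
def hardy_weinberg_deviation(observed):
    """Compare observed genotype counts with Hardy-Weinberg expectations.

    `observed` maps "YY", "Yy", and "yy" to counts of individuals.
    Returns (p, expected_counts, chi_square).
    """
    n = sum(observed.values())
    # Each individual carries two alleles, so the gene pool holds 2n copies.
    p = (2 * observed["YY"] + observed["Yy"]) / (2 * n)
    q = 1.0 - p
    expected = {"YY": p * p * n, "Yy": 2 * p * q * n, "yy": q * q * n}
    chi_square = sum(
        (observed[g] - expected[g]) ** 2 / expected[g]
        for g in expected
        if expected[g] > 0
    )
    return p, expected, chi_square


# Two hypothetical populations with the same allele frequency (p = 0.7).
at_equilibrium = {"YY": 245, "Yy": 210, "yy": 45}
too_few_heterozygotes = {"YY": 300, "Yy": 100, "yy": 100}

for label, counts in [("A", at_equilibrium), ("B", too_few_heterozygotes)]:
    p, expected, chi_square = hardy_weinberg_deviation(counts)
    print(f"population {label}: p = {p:.2f}, chi-square = {chi_square:.1f}")
```

A statistic near zero means the observed genotypes match the expectation; a large value signals a deviation. Deciding how large is large enough requires a significance threshold, which this sketch leaves out.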


Note: 
Link to Learning 
Use this online calculator to determine the genetic structure of a 
population. 


Section Summary 


The modern synthesis of evolutionary theory grew out of the cohesion of 
Darwin’s, Wallace’s, and Mendel’s thoughts on evolution and heredity, 
along with the more modern study of population genetics. It describes the 
evolution of populations and species, from small-scale changes among 
individuals to large-scale changes over paleontological time periods. To 
understand how organisms evolve, scientists can track populations’ allele 
frequencies over time. If they differ from generation to generation, 
scientists can conclude that the population is not in Hardy-Weinberg 
equilibrium, and is thus evolving. 


Art Connections 


Exercise: 


Problem: 


[link] In plants, violet flower color (V) is dominant over white (v). If 
p = 0.8 and q = 0.2 in a population of 500 plants, how many individuals 
would you expect to be homozygous dominant (VV), heterozygous 
(Vv), and homozygous recessive (vv)? How many plants would you 
expect to have violet flowers, and how many would have white 
flowers? 


Solution: 


[link] The expected distribution is 320 VV, 160 Vv, and 20 vv plants. 
Plants with VV or Vv genotypes would have violet flowers, and plants 
with the vv genotype would have white flowers, so a total of 480 
plants would be expected to have violet flowers, and 20 plants would 
have white flowers. 
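The arithmetic behind these numbers can be written out as follows, where N (an assumed symbol) stands for the population size of 500:

```latex
\begin{align*}
p^2 N &= (0.8)^2 \times 500 = 320 \quad \text{(VV)} \\
2pq\,N &= 2(0.8)(0.2) \times 500 = 160 \quad \text{(Vv)} \\
q^2 N &= (0.2)^2 \times 500 = 20 \quad \text{(vv)} \\
\text{violet} &= 320 + 160 = 480, \qquad \text{white} = 20
\end{align*}
```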


Review Questions 


Exercise: 


Problem: What is the difference between micro- and macroevolution? 


a. Microevolution describes the evolution of small organisms, such 
as insects, while macroevolution describes the evolution of large 
organisms, like people and elephants. 

b. Microevolution describes the evolution of microscopic entities, 
such as molecules and proteins, while macroevolution describes 
the evolution of whole organisms. 

c. Microevolution describes the evolution of organisms in 
populations, while macroevolution describes the evolution of 
species over long periods of time. 

d. Microevolution describes the evolution of organisms over their 
lifetimes, while macroevolution describes the evolution of 
organisms over multiple generations. 


Solution: 


C 


Exercise: 


Problem: Population genetics is the study of: 


a. how selective forces change the allele frequencies in a population 
over time 
b. the genetic basis of population-wide traits 


c. whether traits have a genetic basis 
d. the degree of inbreeding in a population 


Solution: 


A 
Exercise: 
Problem: 


Which of the following populations is not in Hardy-Weinberg 
equilibrium? 


a. a population with 12 homozygous recessive individuals (yy), 8 
homozygous dominant individuals (YY), and 4 heterozygous 
individuals (Yy) 

b. a population in which the allele frequencies do not change over 
time 

c. p^2 + 2pq + q^2 = 1 
d. a population undergoing natural selection 


Solution: 


D 
Exercise: 


Problem: 


One of the original Amish colonies rose from a ship of colonists that 
came from Europe. The ship’s captain, who had polydactyly, a rare 
dominant trait, was one of the original colonists. Today, we see a much 
higher frequency of polydactyly in the Amish population. This is an 
example of: 


a. natural selection 
b. genetic drift 


c. founder effect 
d. b and c 


Solution: 


D 


Free Response 


Exercise: 
Problem: 
Solve for the genetic structure of a population with 12 homozygous 
recessive individuals (yy), 8 homozygous dominant individuals (YY), 
and 4 heterozygous individuals (Yy). 


Solution: 
p = (8*2 + 4)/48 = 0.42; q = (12*2 + 4)/48 = 0.58; p^2 = 0.17; 2pq = 0.49; 
q^2 = 0.34 
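The allele-counting step in this solution can be sketched in Python as follows. The function name is an assumption; the genotype counts are the ones given in the problem.

```python
def allele_frequencies_from_genotypes(n_YY, n_Yy, n_yy):
    """Count alleles directly from genotype counts at a two-allele locus."""
    total_alleles = 2 * (n_YY + n_Yy + n_yy)  # two alleles per individual
    p = (2 * n_YY + n_Yy) / total_alleles     # frequency of Y
    q = (2 * n_yy + n_Yy) / total_alleles     # frequency of y
    return p, q


p, q = allele_frequencies_from_genotypes(n_YY=8, n_Yy=4, n_yy=12)
structure = {"p^2": p * p, "2pq": 2 * p * q, "q^2": q * q}

print(f"p = {p:.2f}, q = {q:.2f}")
for term, value in structure.items():
    print(f"{term} = {value:.3f}")
```

Carrying the unrounded values of p (20/48) and q (28/48) through the calculation gives 2pq of about 0.486, and the three terms add up to one.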


Exercise: 


Problem: Explain the Hardy-Weinberg principle of equilibrium theory. 


Solution: 


The Hardy-Weinberg principle of equilibrium is used to describe the 
genetic makeup of a population. The theory states that a population’s 
allele and genotype frequencies are inherently stable: unless some kind 
of evolutionary force is acting upon the population, generation after 
generation of the population would carry the same genes, and 
individuals would, as a whole, look essentially the same. 


Exercise: 


Problem: 


Imagine you are trying to test whether a population of flowers is 
undergoing evolution. You suspect there is selection pressure on the 
color of the flower: bees seem to cluster around the red flowers more 
often than the blue flowers. In a separate experiment, you discover 
blue flower color is dominant to red flower color. In a field, you count 
600 blue flowers and 200 red flowers. What would you expect the 
genetic structure of the flowers to be? 


Solution: 


Red is recessive so q^2 = 200/800 = 0.25; q = 0.5; p = 1 - q = 0.5; p^2 = 
0.25; 2pq = 0.5. You would expect 200 homozygous blue flowers, 400 
heterozygous blue flowers, and 200 red flowers. 
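The reasoning in this solution starts from the recessive phenotype, because red flowers are the only ones whose genotype is known from appearance alone. A minimal Python sketch of that approach follows; the function name is an assumption, and the estimate is only valid if the population is taken to be in Hardy-Weinberg equilibrium.

```python
import math


def structure_from_recessive_count(n_recessive, n_total):
    """Estimate genotype counts from the number of recessive-phenotype individuals.

    Assumes Hardy-Weinberg equilibrium, so the recessive phenotype
    frequency equals q^2.
    """
    q = math.sqrt(n_recessive / n_total)
    p = 1.0 - q
    return {
        "homozygous dominant": p * p * n_total,
        "heterozygous": 2 * p * q * n_total,
        "homozygous recessive": q * q * n_total,
    }


# 600 blue and 200 red flowers were counted, 800 in total.
flower_counts = structure_from_recessive_count(n_recessive=200, n_total=800)
for genotype, count in flower_counts.items():
    print(f"{genotype}: {count:.0f}")
```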


Glossary 


allele frequency 
(also, gene frequency) rate at which a specific allele appears within a 
population 


founder effect 
event that initiates an allele frequency change in part of the population, 
which is not typical of the original population 


gene pool 
all of the alleles carried by all of the individuals in the population 


genetic structure 
distribution of the different possible genotypes in a population 


macroevolution 
broader scale evolutionary changes seen over paleontological time 


microevolution 
changes in a population’s genetic structure 


modern synthesis 
overarching evolutionary paradigm that took shape by the 1940s and is 
generally accepted today 


population genetics 
study of how selective forces change the allele frequencies in a 
population over time 


Population Genetics 
By the end of this section, you will be able to: 


• Describe the different types of variation in a population 
• Explain why only heritable variation can be acted upon by natural 
selection 
• Describe genetic drift and the bottleneck effect 
• Explain how each evolutionary force can influence the allele 
frequencies of a population 


Individuals of a population often display different phenotypes, or express 
different alleles of a particular gene, referred to as polymorphisms. 
Populations with two or more variations of particular characteristics are 
called polymorphic. The distribution of phenotypes among individuals, 
known as the population variation, is influenced by a number of factors, 
including the population’s genetic structure and the environment ([link]). 
Understanding the sources of a phenotypic variation in a population is 
important for determining how a population will evolve in response to 
different evolutionary pressures. 

The distribution of phenotypes in this litter of kittens illustrates 
population variation. (credit: Pieter Lanser) 


Genetic Variance 


Natural selection and some of the other evolutionary forces can only act on 
heritable traits, namely an organism’s genetic code. Because alleles are 
passed from parent to offspring, those that confer beneficial traits or 
behaviors may be selected for, while deleterious alleles may be selected 
against. Acquired traits, for the most part, are not heritable. For example, if 
an athlete works out in the gym every day, building up muscle strength, the 
athlete’s offspring will not necessarily grow up to be a body builder. If there 
is a genetic basis for the ability to run fast, on the other hand, this may be 
passed to a child. 


Note: 
Link to Learning 


Before Darwinian evolution became the prevailing theory of the field, 
French naturalist Jean-Baptiste Lamarck theorized that acquired traits 
could, in fact, be inherited; while this hypothesis has largely been 
unsupported, scientists have recently begun to realize that Lamarck was 
not completely wrong. Visit this site to learn more. 


Heritability is the fraction of phenotype variation that can be attributed to 
genetic differences, or genetic variance, among individuals in a population. 
The greater the heritability of a population’s phenotypic variation, the 
more susceptible it is to the evolutionary forces that act on heritable 
variation. 
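Written as a formula, this definition takes the following form. The symbols are assumptions introduced here for illustration: V_G is the genetic variance, V_E the variance due to the environment, and V_P the total phenotypic variance. The second relation is a simplification that ignores interactions between genes and environment.

```latex
H^2 = \frac{V_G}{V_P}, \qquad V_P = V_G + V_E
```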


The diversity of alleles and genotypes within a population is called genetic 
variance. When scientists are involved in the breeding of a species, such as 
with animals in zoos and nature preserves, they try to increase a 
population’s genetic variance to preserve as much of the phenotypic 
diversity as they can. This also helps reduce the risks associated with 
inbreeding, the mating of closely related individuals, which can have the 
undesirable effect of bringing together deleterious recessive mutations that 
can cause abnormalities and susceptibility to disease. For example, a 
disease that is caused by a rare, recessive allele might exist in a population, 
but it will only manifest itself when an individual carries two copies of the 
allele. Because the allele is rare in a normal, healthy population with 
unrestricted habitat, the chance that two carriers will mate is low, and even 
then, only 25 percent of their offspring will inherit the disease allele from 
both parents. While it is likely to happen at some point, it will not happen 
frequently enough for natural selection to be able to swiftly eliminate the 
allele from the population, and as a result, the allele will be maintained at 
low levels in the gene pool. However, if a family of carriers begins to 
interbreed with each other, this will dramatically increase the likelihood of 
two carriers mating and eventually producing diseased offspring, a 
phenomenon known as inbreeding depression. 
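The 25 percent figure for two carriers follows from enumerating the four equally likely allele combinations, as in a Punnett square. The Python below is a small sketch of that enumeration; the function name and the allele labels ("A" for the common allele, "a" for the rare recessive disease allele) are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Return offspring genotype probabilities for a one-gene cross.

    Each parent is a two-character genotype such as "Aa"; each of a
    parent's two alleles is passed on with probability 1/2.
    """
    offspring = Counter(
        "".join(sorted(allele1 + allele2))
        for allele1, allele2 in product(parent1, parent2)
    )
    total = sum(offspring.values())
    return {genotype: n / total for genotype, n in offspring.items()}


carrier_cross = cross("Aa", "Aa")
for genotype, probability in carrier_cross.items():
    print(f"{genotype}: {probability:.0%}")
```

Only the "aa" offspring, one combination in four, inherit the disease allele from both parents.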


Changes in allele frequencies that are identified in a population can shed 
light on how it is evolving. In addition to natural selection, there are other 
evolutionary forces that could be in play: genetic drift, gene flow, mutation, 
nonrandom mating, and environmental variances. 


Genetic Drift 


The theory of natural selection stems from the observation that some 
individuals in a population are more likely to survive longer and have more 
offspring than others; thus, they will pass on more of their genes to the next 
generation. A big, powerful male gorilla, for example, is much more likely 
than a smaller, weaker one to become the population’s silverback, the 
pack’s leader who mates far more than the other males of the group. The 
pack leader will father more offspring, who share half of his genes, and are 
likely to also grow bigger and stronger like their father. Over time, the 
genes for bigger size will increase in frequency in the population, and the 
population will, as a result, grow larger on average. That is, this would 
occur if this particular selection pressure, or driving selective force, were 
the only one acting on the population. In other examples, better camouflage 
or a stronger resistance to drought might pose a selection pressure. 


Another way a population’s allele and genotype frequencies can change is 
genetic drift ([link]), which is simply the effect of chance. By chance, 
some individuals will have more offspring than others—not due to an 
advantage conferred by some genetically-encoded trait, but just because one 
male happened to be in the right place at the right time (when the receptive 
female walked by) or because the other one happened to be in the wrong 
place at the wrong time (when a fox was hunting). 


Note: 
Art Connection 


[Figure: Genetic Drift. First generation: p (B gene frequency) = .5, 
q (b gene frequency) = .5] 

Genetic drift in a population can lead to the elimination of an allele from a 
population by chance. In this example, the rabbit brown coat color allele (B) 
is dominant over the white coat color allele (b). In the first generation, the 
two alleles occur with equal frequency in the population, resulting in p and q 
values of .5. Only half of the individuals reproduce, resulting in a second 
generation with p and q values of .7 and .3, respectively. Only two 
individuals in the second generation reproduce, and by chance these 
individuals are homozygous dominant for brown coat color. As a result, in the 
third generation the recessive b allele is lost. 


Do you think genetic drift would happen more quickly on an island or on 
the mainland? 


Small populations are more susceptible to the forces of genetic drift. Large 
populations, on the other hand, are buffered against the effects of chance. If 
one individual of a population of 10 individuals happens to die at a young 
age before it leaves any offspring to the next generation, all of its genes— 
1/10 of the population’s gene pool—will be suddenly lost. In a population 
of 100, that’s only 1 percent of the overall gene pool; therefore, it is much 
less impactful on the population’s genetic structure. 
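The effect of population size on drift can be explored with a short simulation. The sketch below uses a simple random-sampling scheme (often called the Wright-Fisher model), which is an assumption here and not something the text describes; the function name, population sizes, and seed are likewise illustrative.

```python
import random


def simulate_drift(population_size, p_start=0.5, generations=50, seed=0):
    """Track one allele's frequency when chance is the only force acting.

    Each generation, 2 * population_size gene copies are drawn at random
    from the previous generation's gene pool, with no selection,
    mutation, or migration.
    """
    rng = random.Random(seed)
    n_copies = 2 * population_size
    p = p_start
    history = [p]
    for _generation in range(generations):
        copies_of_allele = sum(
            1 for _copy in range(n_copies) if rng.random() < p
        )
        p = copies_of_allele / n_copies
        history.append(p)
    return history


small = simulate_drift(population_size=10)
large = simulate_drift(population_size=1000)
print("N = 10:   frequency ranged from", min(small), "to", max(small))
print("N = 1000: frequency ranged from", min(large), "to", max(large))
```

Trying several seeds and comparing the two ranges gives a feel for how much more the frequency wanders in the small population. Once an allele's frequency reaches 0 or 1 in this model, it stays there, because no new copies can arise.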


Note: 
Link to Learning 
Go to this site to watch an animation of random sampling and genetic drift 
in action. 


Genetic drift can also be magnified by natural events, such as a natural 
disaster that kills—at random—a large portion of the population. Known as 
the bottleneck effect, it results in a large portion of the genome suddenly 
being wiped out ([link]). In one fell swoop, the genetic structure of the 
survivors becomes the genetic structure of the entire population, which may 
be very different from the pre-disaster population. 


[Figure: Original population, bottlenecking event.] 
A chance event or catastrophe can reduce the genetic variability within a 
population. 


Another scenario in which populations might experience a strong influence 
of genetic drift is if some portion of the population leaves to start a new 
population in a new location or if a population gets divided by a physical 
barrier of some kind. In this situation, those individuals are unlikely to be 
representative of the entire population, which results in the founder effect. 
The founder effect occurs when the genetic structure changes to match that 
of the new population’s founding fathers and mothers. The founder effect is 
believed to have been a key factor in the genetic history of the Afrikaner 
population of Dutch settlers in South Africa, as evidenced by mutations that 
are common in Afrikaners but rare in most other populations. This is likely 
due to the fact that a higher-than-normal proportion of the founding 
colonists carried these mutations. As a result, the population expresses 
unusually high incidences of Huntington’s disease (HD) and Fanconi 
anemia (FA), a genetic disorder known to cause bone marrow and 
congenital abnormalities, and even cancer.[footnote] 

A. J. Tipping et al., “Molecular and Genealogical Evidence for a Founder 
Effect in Fanconi Anemia Families of the Afrikaner Population of South 
Africa,” PNAS 98, no. 10 (2001): 5734-5739, doi: 
10.1073/pnas.091402398. 


Note: 
Link to Learning 


Watch this short video to learn more about the founder and bottleneck 
effects. 
https://www.openstaxcollege.org/l/founder_bottle 


Note: 

Scientific Method Connection 

Testing the Bottleneck Effect 

Question: How do natural disasters affect the genetic structure of a 
population? 

Background: When much of a population is suddenly wiped out by an 
earthquake or hurricane, the individuals that survive the event are usually a 
random sampling of the original group. As a result, the genetic makeup of 
the population can change dramatically. This phenomenon is known as the 
bottleneck effect. 

Hypothesis: Repeated natural disasters will yield different population 
genetic structures; therefore, each time this experiment is run, the results 
will vary. 

Test the hypothesis: Count out the original population using different 
colored beads. For example, red, blue, and yellow beads might represent 
red, blue, and yellow individuals. After recording the number of each 
individual in the original population, place them all in a bottle with a 
narrow neck that will only allow a few beads out at a time. Then, pour 1/3 
of the bottle’s contents into a bowl. This represents the surviving 
individuals after a natural disaster kills a majority of the population. Count 
the number of the different colored beads in the bowl, and record it. Then, 
place all of the beads back in the bottle and repeat the experiment four 
more times. 

Analyze the data: Compare the five populations that resulted from the 
experiment. Do the populations all contain the same number of different 
colored beads, or do they vary? Remember, these populations all came 
from the same exact parent population. 

Form a conclusion: Most likely, the five resulting populations will differ 
quite dramatically. This is because natural disasters are not selective—they 
kill and spare individuals at random. Now think about how this might 
affect a real population. What happens when a hurricane hits the 
Mississippi Gulf Coast? How do the seabirds that live on the beach fare? 
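The bead experiment can also be run as a quick simulation. The Python below is a sketch under stated assumptions: the starting bead counts, the function name, and the random seed are all hypothetical, and drawing a random third of the beads stands in for pouring them out of the bottle.

```python
import random
from collections import Counter


def bottleneck_trial(beads, fraction_surviving, rng):
    """Draw a random subset of beads, as a disaster that spares individuals at random."""
    n_survivors = round(len(beads) * fraction_surviving)
    return Counter(rng.sample(beads, n_survivors))


# Hypothetical starting population of 60 beads.
original = ["red"] * 30 + ["blue"] * 20 + ["yellow"] * 10
rng = random.Random(42)

# Every trial starts again from the full original population.
trials = [bottleneck_trial(original, 1 / 3, rng) for _trial in range(5)]

print("original:", dict(Counter(original)))
for number, survivors in enumerate(trials, start=1):
    print(f"trial {number}:", dict(survivors))
```

Each trial keeps 20 beads, but the mix of colors among them is free to differ from trial to trial and from the original proportions.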


Gene Flow 


Another important evolutionary force is gene flow: the flow of alleles in 
and out of a population due to the migration of individuals or gametes 
([link]). While some populations are fairly stable, others experience more 
flux. Many plants, for example, send their pollen far and wide, by wind or 
by bird, to pollinate other populations of the same species some distance 
away. Even a population that may initially appear to be stable, such as a 
pride of lions, can experience its fair share of immigration and emigration 
as developing males leave their mothers to seek out a new pride with 
genetically unrelated females. This variable flow of individuals in and out 
of the group not only changes the genetic structure of the population, but it 
can also introduce new genetic variation to populations in different 
geographic locations and habitats. 
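How migration shifts an allele frequency can be illustrated with a simple mixing calculation. This is a sketch under assumptions not found in the text: migration runs in one direction only, a fixed fraction of each generation consists of newcomers, and the function name and numbers are hypothetical.

```python
def allele_frequency_after_migration(p_resident, p_migrant, migration_rate):
    """Allele frequency after one generation of one-way gene flow.

    `migration_rate` is the fraction of the population made up of new
    migrants; the remainder are residents.
    """
    if not 0.0 <= migration_rate <= 1.0:
        raise ValueError("migration_rate must be between 0 and 1")
    return (1.0 - migration_rate) * p_resident + migration_rate * p_migrant


p = 0.2  # starting frequency in the resident population
for generation in range(1, 6):
    p = allele_frequency_after_migration(p, p_migrant=0.8, migration_rate=0.1)
    print(f"generation {generation}: p = {p:.3f}")
```

In this sketch the resident frequency moves toward the migrants' frequency each generation, which is one way of seeing why steady gene flow tends to make populations more alike.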


Gene flow can occur when an individual 
travels from one geographic location to 
another. 


Mutation 


Mutations are changes to an organism’s DNA and are an important driver of 
diversity in populations. Species evolve because of the accumulation of 
mutations that occur over time. The appearance of new mutations is the 
most common way to introduce novel genotypic and phenotypic variance. 
Some mutations are unfavorable or harmful and are quickly eliminated 
from the population by natural selection. Others are beneficial and will 
spread through the population. Whether or not a mutation is beneficial or 
harmful is determined by whether it helps an organism survive to sexual 
maturity and reproduce. Some mutations do not do anything and can linger, 
unaffected by natural selection, in the genome. Some can have a dramatic 
effect on a gene and the resulting phenotype. 


Nonrandom Mating 


If individuals nonrandomly mate with their peers, the result can be a 
changing population. There are many reasons nonrandom mating occurs. 
One reason is simple mate choice; for example, peahens may prefer 
peacocks with bigger, brighter tails. Traits that lead to more matings for an 
individual become selected for by natural selection. One common form of 
mate choice, called assortative mating, is an individual’s preference to 
mate with partners who are phenotypically similar to themselves. 


Another cause of nonrandom mating is physical location. This is especially 
true in large populations spread over large geographic distances where not 
all individuals will have equal access to one another. Some might be miles 
apart through woods or over rough terrain, while others might live 
immediately nearby. 


Environmental Variance 


Genes are not the only players involved in determining population 
variation. Phenotypes are also influenced by other factors, such as the 
environment ([link]). A beachgoer is likely to have darker skin than a city 
dweller, for example, due to regular exposure to the sun, an environmental 
factor. Some major characteristics, such as sex, are determined by the 
environment for some species. For example, some turtles and other reptiles 
have temperature-dependent sex determination (TSD). TSD means that 
individuals develop into males if their eggs are incubated within a certain 
temperature range, or females at a different temperature range. 


The sex of the American alligator 
(Alligator mississippiensis) is 
determined by the temperature at 
which the eggs are incubated. Eggs 
incubated at 30°C produce females, 
and eggs incubated at 33°C produce 
males. (credit: Steve Hillebrand, 
USFWS) 


Geographic separation between populations can lead to differences in the 
phenotypic variation between those populations. Such geographical 
variation is seen between most populations and can be significant. One 
type of geographic variation, called a cline, can be seen as populations of a 
given species vary gradually across an ecological gradient. Species of 
warm-blooded animals, for example, tend to have larger bodies in the 
cooler climates closer to the earth’s poles, allowing them to better conserve 
heat. This is considered a latitudinal cline. Alternatively, flowering plants 
tend to bloom at different times depending on where they are along the 
slope of a mountain, known as an altitudinal cline. 


If there is gene flow between the populations, the individuals will likely 
show gradual differences in phenotype along the cline. Restricted gene 
flow, on the other hand, can lead to abrupt differences, even speciation. 


Section Summary 


Both genetic and environmental factors can cause phenotypic variation in a 
population. Different alleles can confer different phenotypes, and different 
environments can also cause individuals to look or act differently. Only 
those differences encoded in an individual’s genes, however, can be passed 
to its offspring and, thus, be a target of natural selection. Natural selection 
works by selecting for alleles that confer beneficial traits or behaviors, 
while selecting against those for deleterious qualities. Genetic drift stems 
from the chance occurrence that some individuals in the germ line have 
more offspring than others. When individuals leave or join the population, 
allele frequencies can change as a result of gene flow. Mutations to an 
individual’s DNA may introduce new variation into a population. Allele 
frequencies can also be altered when individuals do not randomly mate with 
others in the group. 


Art Connections 


Exercise: 


Problem: 


[link] Do you think genetic drift would happen more quickly on an 
island or on the mainland? 


Solution: 
[link] Genetic drift is likely to occur more rapidly on an island where 
smaller populations are expected to occur. 
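
The effect of population size on drift can be illustrated with a short simulation. What follows is a minimal sketch, assuming a simplified Wright-Fisher-style model that the text itself does not describe: a fixed number of gene copies, two alleles that start at equal frequency, and pure random sampling each generation with no selection, mutation, or gene flow. The function names and the population sizes (20 gene copies for an "island", 200 for a "mainland") are hypothetical choices made for illustration.

```python
import random


def generations_until_fixation(pop_size, start_freq=0.5, seed=0,
                               max_generations=100_000):
    """Count generations until one of two alleles is lost or fixed by chance."""
    rng = random.Random(seed)
    copies = int(pop_size * start_freq)  # copies of allele A in the gene pool
    for generation in range(1, max_generations + 1):
        freq = copies / pop_size
        # Every gene copy in the next generation is drawn at random
        # from the current pool, so the frequency wanders by chance alone.
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
        if copies == 0 or copies == pop_size:
            return generation
    return max_generations


def mean_fixation_time(pop_size, trials=20):
    """Average the fixation time over several independent runs."""
    times = [generations_until_fixation(pop_size, seed=s) for s in range(trials)]
    return sum(times) / len(times)


island = mean_fixation_time(20)     # small population (assumed size)
mainland = mean_fixation_time(200)  # larger population (assumed size)
print("island:", island, "generations; mainland:", mainland, "generations")
```

In this sketch the small population loses one of its alleles in far fewer generations on average, which matches the solution's point that drift acts more rapidly where populations are small.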

Review Questions 


Exercise: 


Problem: 


When male lions reach sexual maturity, they leave their group in 
search of a new pride. This can alter the allele frequencies of the 
population through which of the following mechanisms? 


a. natural selection 
b. genetic drift 

c. gene flow 

d. random mating 


Solution: 


C 
Exercise: 


Problem: 


Which of the following evolutionary forces can introduce new genetic 
variation into a population? 


a. natural selection and genetic drift 

b. mutation and gene flow 

c. natural selection and nonrandom mating 
d. mutation and genetic drift 


Solution: 
B 
Exercise: 
Problem: What is assortative mating? 


a. when individuals mate with those who are similar to themselves 


b. when individuals mate with those who are dissimilar to 
themselves 

c. when individuals mate with those who are the most fit in the 
population 

d. when individuals mate with those who are least fit in the 
population 


Solution: 


A 
Exercise: 
Problem: 
When closely related individuals mate with each other, or inbreed, the 
offspring are often not as fit as the offspring of two unrelated 
individuals. Why? 


a. Close relatives are genetically incompatible. 

b. The DNA of close relatives reacts negatively in the offspring. 

c. Inbreeding can bring together rare, deleterious mutations that lead 
to harmful phenotypes. 

d. Inbreeding causes normally silent alleles to be expressed. 


Solution: 


C 


Exercise: 


Problem: What is a cline? 


a. the slope of a mountain where a population lives 

b. the degree to which a mutation helps an individual survive 
c. the number of individuals in the population 

d. gradual geographic variation across an ecological gradient 


Solution: 


D 


Free Response 


Exercise: 


Problem: 


Describe a situation in which a population would undergo the 
bottleneck effect and explain what impact that would have on the 
population’s gene pool. 


Solution: 


A hurricane kills a large percentage of a population of sand-dwelling 
crustaceans—only a few individuals survive. The alleles carried by 
those surviving individuals would represent the entire population’s 
gene pool. If those surviving individuals are not representative of the 
original population, the post-hurricane gene pool will differ from the 
original gene pool. 


Exercise: 


Problem: 


Describe natural selection and give an example of natural selection at 
work in a population. 


Solution: 


The theory of natural selection stems from the observation that some 
individuals in a population survive longer and have more offspring 
than others: thus, more of their genes are passed to the next generation. 
For example, a big, powerful male gorilla is much more likely than a 
smaller, weaker one to become the population’s silverback: the troop’s 
leader, who mates far more than the other males of the group. 


Therefore, the troop leader will father more offspring who share half of 
his genes and are likely to grow bigger and stronger like their father. 
Over time, the genes for bigger size will increase in frequency in the 
population, and the average body size will, as a result, grow larger. 


Exercise: 


Problem: Explain what a cline is and provide examples. 
Solution: 


A cline is a type of geographic variation that is seen in populations of a 
given species that vary gradually across an ecological gradient. For 
example, warm-blooded animals tend to have larger bodies in the 
cooler climates closer to the earth’s poles, allowing them to better 
conserve heat. This is considered a latitudinal cline. Flowering plants 
tend to bloom at different times depending on where they are along the 
slope of a mountain. This is known as an altitudinal cline. 


Glossary 


assortative mating 
when individuals tend to mate with those who are phenotypically 
similar to themselves 


bottleneck effect 
magnification of genetic drift as a result of natural events or 
catastrophes 


cline 
gradual geographic variation across an ecological gradient 


gene flow 
flow of alleles in and out of a population due to the migration of 
individuals or gametes 


genetic drift 
effect of chance on a population’s gene pool 


genetic variance 
diversity of alleles and genotypes in a population 


geographical variation 
differences in the phenotypic variation between populations that are 
separated geographically 


heritability 
fraction of population variation that can be attributed to its genetic 
variance 


inbreeding 
mating of closely related individuals 


inbreeding depression 
increase in abnormalities and disease in inbreeding populations 


nonrandom mating 
changes in a population’s gene pool due to mate choice or other forces 
that cause individuals to mate with certain phenotypes more than 
others 


population variation 
distribution of phenotypes in a population 


selective pressure 
environmental factor that causes one phenotype to be better than 
another 


Introduction 


The genetic content of each somatic cell in an organism is the same, but 
not all genes are expressed in every cell. The control of which genes are 
expressed dictates whether a cell is (a) an eye cell or (b) a liver cell. 
It is the differential gene expression patterns that arise in different 
cells that give rise to (c) a complete organism. 


Each somatic cell in the body generally contains the same DNA. A few 
exceptions include red blood cells, which contain no DNA in their mature 
state, and some immune system cells that rearrange their DNA while 
producing antibodies. In general, however, the genes that determine 
your eye color, your hair color, and how fast you metabolize food 
are the same in the cells in your eyes and your liver, even though these 
organs function quite differently. If each cell has the same DNA, how is it 
that cells or organs are different? Why do cells in the eye differ so 
dramatically from cells in the liver? 


Whereas each cell shares the same genome and DNA sequence, each cell 
does not turn on, or express, the same set of genes. Each cell type needs a 
different set of proteins to perform its function. Therefore, only a small 
subset of proteins is expressed in a cell. For the proteins to be expressed, 
the DNA must be transcribed into RNA and the RNA must be translated 
into protein. In a given cell type, not all genes encoded in the DNA are 
transcribed into RNA or translated into protein because specific cells in our 
body have specific functions. Specialized proteins that make up the eye 
(iris, lens, and cornea) are only expressed in the eye, whereas the 
specialized proteins in the heart (pacemaker cells, heart muscle, and valves) 
are only expressed in the heart. At any given time, only a subset of all of the 
genes encoded by our DNA are expressed and translated into proteins. The 
expression of specific genes is a highly regulated process with many levels 
and stages of control. This complexity ensures the proper expression in the 
proper cell at the proper time. 
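
The two steps named above, transcription of DNA into RNA and translation of RNA into protein, can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a model of the cell: the DNA sequence is made up, the codon table is a deliberately partial subset of the standard genetic code containing only the codons used here, and the function names are hypothetical.

```python
# Partial codon table: only the codons needed for this example.
# A complete table has 64 entries; this subset is an assumption for brevity.
CODON_TABLE = {
    "AUG": "Met",   # also serves as the start codon
    "GGC": "Gly",
    "UGG": "Trp",
    "UUU": "Phe",
    "UAA": "STOP",
}


def transcribe(coding_strand):
    """Return the mRNA that matches a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        # Codons missing from the partial table would raise a KeyError.
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


dna = "ATGGGCTGGTTTTAA"   # hypothetical coding-strand sequence
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)
print("-".join(protein))
```

A gene that is never transcribed never reaches the `translate` step, which is the sense in which controlling transcription controls which proteins a cell makes.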


Regulation of Gene Expression 
By the end of this section, you will be able to: 


• Discuss why every cell does not express all of its genes 
• Describe how prokaryotic gene regulation occurs at the transcriptional 
level 
• Discuss how eukaryotic gene regulation occurs at the epigenetic, 
transcriptional, post-transcriptional, translational, and 
post-translational levels 


For a cell to function properly, necessary proteins must be synthesized at 
the proper time. All cells control or regulate the synthesis of proteins from 
information encoded in their DNA. The process of turning on a gene to 
produce RNA and protein is called gene expression. Whether in a simple 
unicellular organism or a complex multicellular organism, each cell 
controls when and how its genes are expressed. For this to occur, there must 
be a mechanism to control when a gene is expressed to make RNA and 
protein, how much of the protein is made, and when it is time to stop 
making that protein because it is no longer needed. 


The regulation of gene expression conserves energy and space. It would 
require a significant amount of energy for an organism to express every 
gene at all times, so it is more energy efficient to turn on the genes only 
when they are required. In addition, only expressing a subset of genes in 
each cell saves space because DNA must be unwound from its tightly 
coiled structure before it can be transcribed. Cells would have to be 
enormous if every protein were expressed in every cell all the time. 


The control of gene expression is extremely complex. Malfunctions in this 
process are detrimental to the cell and can lead to the development of many 
diseases, including cancer. 


Prokaryotic versus Eukaryotic Gene Expression 


To understand how gene expression is regulated, we must first understand 
how a gene codes for a functional protein in a cell. The process occurs in 
both prokaryotic and eukaryotic cells, just in slightly different manners. 


Prokaryotic organisms are single-celled organisms that lack a cell nucleus, 
and their DNA therefore floats freely in the cell cytoplasm. To synthesize a 
protein, the processes of transcription and translation occur almost 
simultaneously. When the resulting protein is no longer needed, 
transcription stops. As a result, the primary method to control what type of 
protein and how much of each protein is expressed in a prokaryotic cell is 
the regulation of DNA transcription. All of the subsequent steps occur 
automatically. When more protein is required, more transcription occurs. 
Therefore, in prokaryotic cells, the control of gene expression is mostly at 
the transcriptional level. 


Eukaryotic cells, in contrast, have intracellular organelles that add to their 
complexity. In eukaryotic cells, the DNA is contained inside the cell’s 
nucleus and there it is transcribed into RNA. The newly synthesized RNA is 
then transported out of the nucleus into the cytoplasm, where ribosomes 
translate the RNA into protein. The processes of transcription and 
translation are physically separated by the nuclear membrane; transcription 
occurs only within the nucleus, and translation occurs only outside the 
nucleus in the cytoplasm. The regulation of gene expression can occur at all 
stages of the process ([link]). Regulation may occur when the DNA is 
uncoiled and loosened from nucleosomes to bind transcription factors 
(epigenetic level), when the RNA is transcribed (transcriptional level), 
when the RNA is processed and exported to the cytoplasm after it is 
transcribed (post-transcriptional level), when the RNA is translated into 
protein (translational level), or after the protein has been made 
(post-translational level). 


Prokaryotic transcription and translation occur simultaneously 
in the cytoplasm, and regulation occurs at the transcriptional 
level. Eukaryotic gene expression is regulated during 
transcription and RNA processing, which take place in the 
nucleus, and during protein translation, which takes place in 
the cytoplasm. Further regulation may occur through 
post-translational modifications of proteins. 


The differences in the regulation of gene expression between prokaryotes 
and eukaryotes are summarized in [link]. The regulation of gene expression 
is discussed in detail in subsequent modules. 


Differences in the Regulation of Gene Expression of Prokaryotic 
and Eukaryotic Organisms 


Prokaryotic organisms | Eukaryotic organisms 
Lack nucleus | Contain nucleus 
DNA is found in the cytoplasm | DNA is confined to the nuclear compartment 
RNA transcription and protein formation occur almost simultaneously | RNA transcription occurs prior to protein formation, and it takes place in the nucleus. Translation of RNA to protein occurs in the cytoplasm. 
Gene expression is regulated primarily at the transcriptional level | Gene expression is regulated at many levels (epigenetic, transcriptional, nuclear shuttling, post-transcriptional, translational, and post-translational) 

Note: 


Evolution Connection 

Evolution of Gene Regulation 

Prokaryotic cells regulate gene expression primarily by controlling the 
amount of transcription. As eukaryotic cells evolved, the complexity of the 
control of gene expression increased. For example, with the evolution of 
eukaryotic cells came compartmentalization of important cellular 
components and cellular processes. A nuclear region that contains the 
DNA was formed. Transcription and translation were physically separated 
into two different cellular compartments. It therefore became possible to 
control gene expression by regulating transcription in the nucleus, and also 
by controlling the RNA levels and protein translation present outside the 
nucleus. 


Some cellular processes arose from the need of the organism to defend 
itself. Cellular processes such as gene silencing developed to protect the 
cell from viral or parasitic infections. If the cell could quickly shut off gene 
expression for a short period of time, it would be able to survive an 
infection when other organisms could not. Therefore, the organism evolved 
a new process that helped it survive, and it was able to pass this new 
development to offspring. 


Section Summary 


While all somatic cells within an organism contain the same DNA, not all 
cells within that organism express the same proteins. Prokaryotic organisms 
express the entire DNA they encode in every cell, but not necessarily all at 
the same time. Proteins are expressed only when they are needed. 
Eukaryotic organisms express a subset of the DNA that is encoded in any 
given cell. In each cell type, the type and amount of protein is regulated by 
controlling gene expression. To express a protein, the DNA is first 
transcribed into RNA, which is then translated into proteins. In prokaryotic 
cells, these processes occur almost simultaneously. In eukaryotic cells, 
transcription occurs in the nucleus and is separate from the translation that 
occurs in the cytoplasm. Gene expression in prokaryotes is mostly regulated 
at the transcriptional level (some epigenetic and post-translational 
regulation is also present), whereas in eukaryotic cells, gene expression is 
regulated at the epigenetic, transcriptional, post-transcriptional, 
translational, and post-translational levels. 


Review Questions 


Exercise: 


Problem: 


Control of gene expression in eukaryotic cells occurs at which 
level(s)? 


a. only the transcriptional level 


b. epigenetic and transcriptional levels 

c. epigenetic, transcriptional, and translational levels 

d. epigenetic, transcriptional, post-transcriptional, translational, and 
post-translational levels 


Solution: 
D 
Exercise: 
Problem: Post-translational control refers to: 


a. regulation of gene expression after transcription 
b. regulation of gene expression after translation 
c. control of epigenetic activation 

d. period between transcription and translation 


Solution: 


B 


Free Response 


Exercise: 


Problem: 


Name two differences between prokaryotic and eukaryotic cells and 
how these differences benefit multicellular organisms. 


Solution: 


Eukaryotic cells have a nucleus, whereas prokaryotic cells do not. In 
eukaryotic cells, DNA is confined within the nuclear region. Because 
of this, transcription and translation are physically separated. This 
creates a more complex mechanism for the control of gene expression 
that benefits multicellular organisms because it compartmentalizes 
gene regulation. 


Gene expression occurs at many stages in eukaryotic cells, whereas in 
prokaryotic cells, control of gene expression only occurs at the 
transcriptional level. This allows for greater control of gene expression 
in eukaryotes and more complex systems to be developed. Because of 
this, different cell types can arise in an individual organism. 


Exercise: 


Problem: 


Describe how controlling gene expression will alter the overall protein 
levels in the cell. 


Solution: 


The cell controls which proteins are expressed and to what level each 
protein is expressed in the cell. Prokaryotic cells alter the transcription 
rate to turn genes on or off. This method will increase or decrease 
protein levels in response to what is needed by the cell. Eukaryotic 
cells change the accessibility (epigenetic), transcription, or translation 
of a gene. This will alter the amount of RNA and the lifespan of the 
RNA to alter the amount of protein that exists. Eukaryotic cells also 
control protein translation to increase or decrease the overall levels. 
Eukaryotic organisms are much more complex and can manipulate 
protein levels by changing many stages in the process. 


Glossary 


epigenetic 
heritable changes that do not involve changes in the DNA sequence 


gene expression 
processes that control the turning on or turning off of a gene 


post-transcriptional 
control of gene expression after the RNA molecule has been created 
but before it is translated into protein 


post-translational 
control of gene expression after a protein has been created 


Prokaryotic Gene Regulation 
By the end of this section, you will be able to: 


• Describe the steps involved in prokaryotic gene regulation 
• Explain the roles of activators, inducers, and repressors in gene 
regulation 


The DNA of prokaryotes is organized into a circular chromosome 
supercoiled in the nucleoid region of the cell cytoplasm. Proteins that are 
needed for a specific function, or that are involved in the same biochemical 
pathway, are encoded together in blocks called operons. For example, all of 
the genes needed to use lactose as an energy source are coded next to each 
other in the lactose (or lac) operon. 


In prokaryotic cells, there are three types of regulatory molecules that can 
affect the expression of operons: repressors, activators, and inducers. 
Repressors are proteins that suppress transcription of a gene in response to 
an external stimulus, whereas activators are proteins that increase the 
transcription of a gene in response to an external stimulus. Finally, inducers 
are small molecules that either activate or repress transcription depending 
on the needs of the cell and the availability of substrate. 


The trp Operon: A Repressor Operon 


Bacteria such as E. coli need amino acids to survive. Tryptophan is one 
such amino acid that E. coli can ingest from the environment. E. coli can 
also synthesize tryptophan using enzymes that are encoded by five genes. 
These five genes are next to each other in what is called the tryptophan 
(trp) operon ([link]). If tryptophan is present in the environment, then E. 
coli does not need to synthesize it and the switch controlling the activation 
of the genes in the trp operon is switched off. However, when tryptophan 
availability is low, the switch controlling the operon is turned on, 
transcription is initiated, the genes are expressed, and tryptophan is 
synthesized. 


[Figure: the trp operon. When tryptophan is present, the trp repressor 
binds the operator, and RNA synthesis is blocked. In the absence of 
tryptophan, the repressor dissociates from the operator, and RNA synthesis 
proceeds.] 


The five genes that are needed to synthesize 
tryptophan in E. coli are located next to each other 
in the trp operon. When tryptophan is plentiful, 
two tryptophan molecules bind the repressor 
protein at the operator sequence. This physically 
blocks the RNA polymerase from transcribing the 
tryptophan genes. When tryptophan is absent, the 
repressor protein does not bind to the operator and 
the genes are transcribed. 


A DNA sequence that codes for proteins is referred to as the coding region. 
The five coding regions for the tryptophan biosynthesis enzymes are 
arranged sequentially on the chromosome in the operon. Just before the 
coding region is the transcriptional start site. The promoter sequence, to 
which RNA polymerase binds to initiate transcription, is upstream of the 
transcriptional start site; each operon has a sequence within or near the 
promoter to which proteins (activators or repressors) can bind and 
regulate transcription. 


A DNA sequence called the operator sequence is encoded between the 
promoter region and the first trp coding gene. This operator contains the 
DNA code to which the repressor protein can bind. When tryptophan is 
present in the cell, two tryptophan molecules bind to the trp repressor, 
which changes shape to bind to the trp operator. Binding of the 
tryptophan-repressor complex at the operator physically prevents the RNA 
polymerase from binding and transcribing the downstream genes. 


When tryptophan is not present in the cell, the repressor by itself does not 
bind to the operator; therefore, the operon is active and tryptophan is 
synthesized. Because the repressor protein actively binds to the operator to 
keep the genes turned off, the trp operon is negatively regulated and the 
proteins that bind to the operator to silence trp expression are negative 
regulators. 
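
The negative regulation described above reduces to a simple on/off rule, which can be written out as code. This is a minimal sketch, assuming tryptophan availability is treated as a yes/no signal rather than a concentration; the function name and the dictionary keys are hypothetical.

```python
def trp_operon_state(tryptophan_present):
    """Return the state of the trp operon for a given tryptophan signal."""
    # The repressor binds the operator only when tryptophan is bound to it.
    repressor_on_operator = tryptophan_present
    # A repressor sitting on the operator blocks RNA polymerase,
    # so transcription requires a free operator.
    transcription = not repressor_on_operator
    return {
        "repressor_on_operator": repressor_on_operator,
        "transcription": transcription,
    }


for tryptophan_present in (True, False):
    print(tryptophan_present, trp_operon_state(tryptophan_present))
```

The operon is on whenever the signal is absent, which is what it means for a repressible operon to be on by default.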


Note: 
Link to Learning 
Watch this video to learn more about the trp operon. 
https://www.openstaxcollege.org/I/trp_operon 


Catabolite Activator Protein (CAP): An Activator Regulator 


Just as the trp operon is negatively regulated by tryptophan molecules, there 
are proteins that bind to the operator sequences and act as positive 
regulators to turn genes on and activate them. For example, when glucose is 
scarce, E. coli bacteria can turn to other sugar sources for fuel. To do this, 
new genes to process these alternate sugars must be transcribed. When 
glucose levels drop, cyclic AMP (cAMP) begins to accumulate in the cell. 


The cAMP molecule is a signaling molecule that is involved in glucose and 
energy metabolism in E. coli. When glucose levels decline in the cell, 
accumulating cAMP binds to the positive regulator catabolite activator 
protein (CAP), a protein that binds to the promoters of operons that control 
the processing of alternative sugars. When cAMP binds to CAP, the 
complex binds to the promoter region of the genes that are needed to use 
the alternate sugar sources ([link]). In these operons, a CAP binding site is 
located upstream of the RNA polymerase binding site in the promoter. This 
increases the binding ability of RNA polymerase to the promoter region and 
the transcription of the genes. 


[Figure: CAP and the lac genes (lacZ, lacY). In the absence of cAMP, CAP 
does not bind the promoter, and transcription occurs at a low rate. In the 
presence of cAMP, CAP binds the promoter and increases RNA polymerase 
activity.] 


When glucose levels fall, E. coli may use other 
sugars for fuel but must transcribe new genes to do 
so. As glucose supplies become limited, cAMP 
levels increase. This cAMP binds to the CAP 
protein, a positive regulator that binds to an 
operator region upstream of the genes required to 
use other sugar sources. 


The lac Operon: An Inducer Operon 


The third type of gene regulation in prokaryotic cells occurs through 
inducible operons, which have proteins that bind to activate or repress 
transcription depending on the local environment and the needs of the cell. 
The lac operon is a typical inducible operon. As mentioned previously, E. 
coli is able to use other sugars as energy sources when glucose 
concentrations are low. To do so, the cAMP-CAP protein complex serves 
as a positive regulator to induce transcription. One such sugar source is 
lactose. The lac operon encodes the genes necessary to acquire and process 
the lactose from the local environment. CAP binds to the operator sequence 
upstream of the promoter that initiates transcription of the lac operon. 
However, for the lac operon to be activated, two conditions must be met. 
First, the level of glucose must be very low or non-existent. Second, lactose 
must be present. Only when glucose is absent and lactose is present will the 
lac operon be transcribed ([link]). This makes sense for the cell, because it 
would be energetically wasteful to create the proteins to process lactose if 
glucose were plentiful or lactose were not available. 


Note: 
Art Connection 


[Figure: regulation of the lac operon, in four panels. 
In the absence of lactose, the lac repressor binds the operator, and 
transcription is blocked. 
In the presence of lactose, the lac repressor is released from the 
operator, and transcription proceeds at a slow rate. 
The cAMP-CAP complex stimulates RNA polymerase activity and increases RNA 
synthesis. 
However, even in the presence of the cAMP-CAP complex, RNA synthesis is 
blocked when the repressor is bound to the operator.] 


Transcription of the lac operon 
is carefully regulated so that its 
expression only occurs when 
glucose is limited and lactose is 
present to serve as an 
alternative fuel source. 


In E. coli, the trp operon is on by default, while the lac operon is off. Why 
do you think this is the case? 


If glucose is absent, then CAP can bind to the operator sequence to activate 
transcription. If lactose is absent, then the repressor binds to the operator to 
prevent transcription. If either requirement is not met, that is, if glucose 
is present or lactose is absent, then transcription remains off or occurs 
only at a low rate. Only when both conditions are satisfied is the lac 
operon transcribed ([link]). 


Signals that Induce or Repress Transcription of the lac Operon 


Glucose | CAP binds | Lactose | Repressor binds | Transcription 
+ | - | - | + | No 
+ | - | + | - | Some 
- | + | - | + | No 
- | + | + | - | Yes 
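
The two signals that control the lac operon can be combined into one small decision function. This is a minimal sketch, assuming glucose and lactose are treated as present/absent signals; the function name and the "No"/"Some"/"Yes" return labels are illustrative choices, with "Some" standing for the low rate of transcription that occurs when the repressor is released but CAP is not bound.

```python
def lac_operon_transcription(glucose_present, lactose_present):
    """Classify lac operon transcription as "No", "Some", or "Yes"."""
    # cAMP accumulates only when glucose is scarce, so the cAMP-CAP
    # complex forms and binds only when glucose is absent.
    cap_binds = not glucose_present
    # The repressor stays on the operator unless lactose is present.
    repressor_binds = not lactose_present

    if repressor_binds:
        return "No"      # a bound repressor blocks transcription outright
    if cap_binds:
        return "Yes"     # repressor released and CAP bound
    return "Some"        # repressor released, but no help from CAP


for glucose in (True, False):
    for lactose in (True, False):
        print("glucose:", glucose, "lactose:", lactose,
              "->", lac_operon_transcription(glucose, lactose))
```

Checking the repressor first reflects the point made in the figure: even with the cAMP-CAP complex in place, RNA synthesis is blocked while the repressor is bound to the operator.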
Note: 
Link to Learning 
Watch an animated tutorial about the workings of the lac operon here. 
https://www.openstaxcollege.org/I/lac_operon 


Section Summary 


The regulation of gene expression in prokaryotic cells occurs at the 
transcriptional level. There are three ways to control the transcription of an 
operon: repressive control, activator control, and inducible control. 
Repressive control, typified by the trp operon, uses proteins bound to the 
operator sequence to physically prevent the binding of RNA polymerase 
and the activation of transcription. Therefore, if tryptophan is not needed, 
the repressor is bound to the operator and transcription remains off. 
Activator control, typified by the action of CAP, increases the binding 
ability of RNA polymerase to the promoter when CAP is bound. In this 
case, low levels of glucose result in the binding of cAMP to CAP. CAP then 
binds the promoter, which allows RNA polymerase to bind to the promoter 
better. In the last example—the lac operon—two conditions must be met to 
initiate transcription. Glucose must not be present, and lactose must be 
available for the lac operon to be transcribed. If glucose is absent, CAP 
binds to the operator. If lactose is present, the repressor protein does not 
bind to its operator. Only when both conditions are met will RNA 
polymerase bind to the promoter to induce transcription. 


Art Connections 


Exercise: 


Problem: 


[link] In E. coli, the trp operon is on by default, while the lac operon is 
off. Why do you think that this is the case? 


Solution: 


[link] Tryptophan is an amino acid essential for making proteins, so 
the cell always needs to have some on hand. However, if plenty of 
tryptophan is present, it is wasteful to make more, and the expression 
of the trp operon is repressed. Lactose, a sugar found in milk, is not 
always available. It makes no sense to make the enzymes necessary to 
digest an energy source that is not available, so the lac operon is only 
turned on when lactose is present. 


Review Questions 


Exercise: 


Problem: 
If glucose is absent, but so is lactose, the lac operon will be ________. 


a. activated 

b. repressed 

c. activated, but only partially 
d. mutated 


Solution: 


B 
Exercise: 


Problem: 


Prokaryotic cells lack a nucleus. Therefore, the genes in prokaryotic 
cells are: 


a. all expressed, all of the time 

b. transcribed and translated almost simultaneously 

c. transcriptionally controlled because translation begins before 
transcription ends 

d. b and c are both true 


Solution: 


D 


Free Response 


Exercise: 
Problem: 


Describe how transcription in prokaryotic cells can be altered by 
external stimulation such as excess lactose in the environment. 


Solution: 


Environmental stimuli can increase or induce transcription in 
prokaryotic cells. In this example, lactose in the environment will 
induce the transcription of the lac operon, but only if glucose is not 
available in the environment. 


Exercise: 


Problem: 


What is the difference between a repressible and an inducible operon? 


Solution: 


A repressible operon uses a protein bound to the operator region of a 
gene to keep the gene repressed or silent. This repressor must be 
actively removed in order to transcribe the gene. An inducible operon 
is either activated or repressed depending on the needs of the cell and 
what is available in the local environment. 


Glossary 


activator 
protein that binds to prokaryotic operators to increase transcription 


catabolite activator protein (CAP) 
protein that complexes with cAMP to bind to the promoter sequences 
of operons that control sugar processing when glucose is not available 


inducible operon 
operon that can be activated or repressed depending on cellular needs 
and the surrounding environment 


lac operon 
operon in prokaryotic cells that encodes genes required for processing 
and intake of lactose 


negative regulator 
protein that prevents transcription 


operator 
region of DNA outside of the promoter region that binds activators or 
repressors that control gene expression in prokaryotic cells 


operon 
collection of genes involved in a pathway that are transcribed together 
as a single mRNA in prokaryotic cells 


positive regulator 
protein that increases transcription 


repressor 
protein that binds to the operator of prokaryotic genes to prevent 
transcription 


transcriptional start site 
site at which transcription begins 


trp operon 
series of genes necessary to synthesize tryptophan in prokaryotic cells 


tryptophan 
amino acid that can be synthesized by prokaryotic cells when 
necessary 


Eukaryotic Epigenetic Gene Regulation 
By the end of this section, you will be able to: 


• Explain the process of epigenetic regulation 
• Describe how access to DNA is controlled by histone modification 


Eukaryotic gene expression is more complex than prokaryotic gene 
expression because the processes of transcription and translation are 
physically separated. Unlike prokaryotic cells, eukaryotic cells can regulate 
gene expression at many different levels. Eukaryotic gene expression begins 
with control of access to the DNA. This form of regulation, called 
epigenetic regulation, occurs even before transcription is initiated. 


Epigenetic Control: Regulating Access to Genes within the 
Chromosome 


The human genome encodes over 20,000 genes; each of the 23 pairs of 
human chromosomes encodes thousands of genes. The DNA in the nucleus 
is precisely wound, folded, and compacted into chromosomes so that it will 
fit into the nucleus. It is also organized so that specific segments can be 
accessed as needed by a specific cell type. 


The first level of organization, or packing, is the winding of DNA strands 
around histone proteins. Histones package and order DNA into structural 
units called nucleosome complexes, which can control the access of 
proteins to the DNA regions ([link]a). Under the electron microscope, this 
winding of DNA around histone proteins to form nucleosomes looks like 
small beads on a string ([link]b). These beads (histone proteins) can move 
along the string (DNA) and change the structure of the molecule. 


[Figure labels: histone, nucleosome; panels (a) and (b)] 


DNA is folded around histone proteins to create (a) 
nucleosome complexes. These nucleosomes control the access 
of proteins to the underlying DNA. When viewed through an 
electron microscope (b), the nucleosomes look like beads on a 
string. (credit “micrograph”: modification of work by Chris 
Woodcock) 


If DNA encoding a specific gene is to be transcribed into RNA, the 
nucleosomes surrounding that region of DNA can slide down the DNA to 
open that specific chromosomal region and allow for the transcriptional 
machinery (RNA polymerase) to initiate transcription ([link]). Nucleosomes 
can move to open the chromosome structure to expose a segment of DNA, 
but do so in a very controlled manner. 


Note: 
Art Connection 


[Figure, top panel: histone tails carrying methyl groups. DNA inaccessible, 
gene inactive.] 
Methylation of DNA and histones causes nucleosomes to pack tightly 
together. Transcription factors cannot bind the DNA, and genes are not 
expressed. 


[Figure, bottom panel: histone tails carrying acetyl groups. DNA accessible, 
gene active.] 
Histone acetylation results in loose packing of nucleosomes. Transcription 
factors can bind the DNA and genes are expressed. 


Nucleosomes can slide along DNA. When 
nucleosomes are spaced closely together (top), 
transcription factors cannot bind and gene 
expression is turned off. When the nucleosomes 
are spaced far apart (bottom), the DNA is exposed. 
Transcription factors can bind, allowing gene 
expression to occur. Modifications to the histones 
and DNA affect nucleosome spacing. 


In females, one of the two X chromosomes is inactivated during embryonic 
development because of epigenetic changes to the chromatin. What impact 
do you think these changes would have on nucleosome packing? 


How the histone proteins move is dependent on signals found on both the 
histone proteins and on the DNA. These signals are tags added to histone 
proteins and DNA that tell the histones if a chromosomal region should be 
open or closed ([link] depicts modifications to histone proteins and DNA). 
These tags are not permanent, but may be added or removed as needed. 
They are chemical modifications (phosphate, methyl, or acetyl groups) that 
are attached to specific amino acids in the protein or to the nucleotides of 
the DNA. The tags do not alter the DNA base sequence, but they do alter 
how tightly wound the DNA is around the histone proteins. DNA is a 
negatively charged molecule; therefore, changes in the charge of the histone 
will change how tightly wound the DNA molecule will be. When 
unmodified, the histone proteins have a large positive charge; by adding 
chemical modifications like acetyl groups, the charge becomes less positive. 


The DNA molecule itself can also be modified. This occurs within very 
specific regions called CpG islands. These are stretches with a high 
frequency of cytosine and guanine dinucleotide DNA pairs (CG) found in 
the promoter regions of genes. When this configuration exists, the cytosine 
member of the pair can be methylated (a methyl group is added). This 
modification changes how the DNA interacts with proteins, including the 
histone proteins that control access to the region. Highly methylated 
(hypermethylated) DNA regions with deacetylated histones are tightly 
coiled and transcriptionally inactive. 
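
The idea of a CpG island as a stretch with a high frequency of CG dinucleotides can be made concrete by counting them. The Python sketch below is illustrative only: the function name `cpg_stats` and the example sequences are made up for this illustration, and the observed-to-expected ratio it reports is a common bioinformatics heuristic rather than something defined in this text.

```python
def cpg_stats(seq):
    """Return (CpG count, GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    # A CpG is a C immediately followed by a G on the same strand.
    cpg_count = seq.count("CG")
    gc_fraction = (c_count + g_count) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: (C * G) / n.
    if c_count and g_count:
        obs_exp = cpg_count * n / (c_count * g_count)
    else:
        obs_exp = 0.0
    return cpg_count, gc_fraction, obs_exp


# Hypothetical toy sequences, far shorter than a real promoter region.
for name, seq in [("CG-rich", "CGCGCGAT"), ("AT-rich", "ATATATAT")]:
    count, gc, ratio = cpg_stats(seq)
    print(f"{name}: CpG count={count}, GC fraction={gc:.2f}, obs/exp={ratio:.2f}")
```

A region with many CG dinucleotides offers many cytosines that can be methylated, which is why such regions matter for the regulation described above.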


EPIGENETIC CHANGES TO THE CHROMATIN MAY RESULT FROM 
• Development (in utero, childhood) 
• Environmental chemicals 
• Drugs/Pharmaceuticals 
• Aging 
• Diet 

EPIGENETIC CHANGES MAY RESULT IN 
• Cancer 
• Autoimmune disease 
• Mental disorders 
• Diabetes 

[Figure labels: chromatin, methyl group, DNA methylation and chemical 
modification. Histones are proteins around which DNA winds for compaction 
and gene regulation.] 


Histone proteins and DNA nucleotides can be modified chemically. 
Modifications affect nucleosome spacing and gene expression. (credit: 
modification of work by NIH) 


This type of gene regulation is called epigenetic regulation. Epigenetic 
means “above genetics” or “on top of genetics.” The changes that occur to 
the histone proteins 
and DNA do not alter the nucleotide sequence and are not permanent. 
Instead, these changes are temporary (although they often persist through 
multiple rounds of cell division) and alter the chromosomal structure (open 
or closed) as needed. A gene can be turned on or off depending upon the 
location and modifications to the histone proteins and DNA. If a gene is to 
be transcribed, the histone proteins and DNA are modified surrounding the 
chromosomal region encoding that gene. This opens the chromosomal 
region to allow access for RNA polymerase and other proteins, called 
transcription factors, to bind to the promoter region, located just upstream 
of the gene, and initiate transcription. If a gene is to remain turned off, or 
silenced, the histone proteins and DNA have different modifications that 
signal a closed chromosomal configuration. In this closed configuration, the 
RNA polymerase and transcription factors do not have access to the DNA 
and transcription cannot occur ([link]). 


Note: 
Link to Learning 
View this video that describes how epigenetic regulation controls gene 
expression. 
https://www.openstaxcollege.org/l/epigenetic_reg 


Section Summary 


In eukaryotic cells, the first stage of gene expression control occurs at the 
epigenetic level. Epigenetic mechanisms control access to the chromosomal 
region to allow genes to be turned on or off. These mechanisms control how 
DNA is packed into the nucleus by regulating how tightly the DNA is 
wound around histone proteins. The addition or removal of chemical 
modifications (or flags) to histone proteins or DNA signals to the cell to 
open or close a chromosomal region. Therefore, eukaryotic cells can control 
whether a gene is expressed by controlling accessibility to transcription 
factors and the binding of RNA polymerase to initiate transcription. 


Art Connections 


Exercise: 


Problem: 


[link] In females, one of the two X chromosomes is inactivated during 
embryonic development because of epigenetic changes to the 
chromatin. What impact do you think these changes would have on 
nucleosome packing? 


Solution: 


[link] The nucleosomes would pack more tightly together. 


Review Questions 


Exercise: 


Problem: What are epigenetic modifications? 


a. the addition of reversible changes to histone proteins and DNA 
b. the removal of nucleosomes from the DNA 

c. the addition of more nucleosomes to the DNA 

d. mutation of the DNA sequence 


Solution: 
A 
Exercise: 
Problem: Which of the following are true of epigenetic changes? 


a. allow DNA to be transcribed 

b. move histones to open or close a chromosomal region 
c. are temporary 

d. all of the above 


Solution: 


D 


Free Response 


Exercise: 


Problem: 


In cancer cells, alteration to epigenetic modifications turns off genes 
that are normally expressed. Hypothetically, how could you reverse 
this process to turn these genes back on? 


Solution: 


You can create medications that reverse the epigenetic processes (to 
add histone acetylation marks or to remove DNA methylation) and 
create an open chromosomal configuration. 


Glossary 


transcription factor 
protein that binds to the DNA at the promoter or enhancer region and 
that influences transcription of a gene 


Eukaryotic Transcription Gene Regulation 
By the end of this section, you will be able to: 


• Discuss the role of transcription factors in gene regulation 
• Explain how enhancers and repressors regulate gene expression 


As in prokaryotic cells, the transcription of genes in eukaryotes requires 
an RNA polymerase to bind to a sequence upstream of a gene to 
initiate transcription. However, unlike prokaryotic cells, the eukaryotic 
RNA polymerase requires other proteins, or transcription factors, to 
facilitate transcription initiation. Transcription factors are proteins that bind 
to the promoter sequence and other regulatory sequences to control the 
transcription of the target gene. RNA polymerase by itself cannot initiate 
transcription in eukaryotic cells. Transcription factors must bind to the 
promoter region first and recruit RNA polymerase to the site for 
transcription to be established. 


Note: 
Link to Learning 
View the process of transcription, the making of RNA from a DNA 
template. 
https://www.openstaxcollege.org/l/transcript_RNA 


The Promoter and the Transcription Machinery 


Genes are organized to make the control of gene expression easier. The 
promoter region is immediately upstream of the coding sequence. This 
region can be short (only a few nucleotides in length) or quite long 
(hundreds of nucleotides long). The longer the promoter, the more available 
space for proteins to bind. This also adds more control to the transcription 
process. The length of the promoter is gene-specific and can differ 
dramatically between genes. Consequently, the level of control of gene 
expression can also differ quite dramatically between genes. The purpose of 
the promoter is to bind transcription factors that control the initiation of 
transcription. 


Within the promoter region, just upstream of the transcriptional start site, 
resides the TATA box. This box is simply a repeat of thymine and adenine 
dinucleotides (literally, TATA repeats). RNA polymerase binds to the 
transcription initiation complex, allowing transcription to occur. To initiate 
transcription, a transcription factor (TFIID) is the first to bind to the TATA 
box. Binding of TFIID recruits other transcription factors, including TFIIB, 
TFIIE, TFIIF, and TFIIH to the TATA box. Once this complex is assembled, 
RNA polymerase can bind to its upstream sequence. When bound along 
with the transcription factors, RNA polymerase is phosphorylated. This 
releases part of the protein from the DNA to activate the transcription 
initiation complex and places RNA polymerase in the correct orientation to 
begin transcription; DNA-bending protein brings the enhancer, which can 
be quite a distance from the gene, in contact with transcription factors and 
mediator proteins ([link]). 


[Figure labels: DNA-bending protein, enhancer, distal control elements, 
transcription factors and mediator proteins, activators, RNA polymerase] 

An enhancer is a DNA sequence 
that promotes transcription. Each 
enhancer is made up of short DNA 
sequences called distal control 
elements. Activators bound to the 
distal control elements interact 
with mediator proteins and 
transcription factors. Two different 
genes may have the same 
promoter but different distal 
control elements, enabling 
differential gene expression. 
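
The TATA box described in the text sits just upstream of the transcriptional start site, so locating it in a sequence amounts to a short string search. The following Python sketch is a simplified illustration: the function name `find_tata_box`, the 40-nucleotide search window, and the example sequence are all assumptions chosen for this example, not values taken from the text.

```python
def find_tata_box(dna, tss_index, window=40, motif="TATA"):
    """Return the index of the motif nearest the start site, or None if absent.

    Only the `window` nucleotides immediately upstream of `tss_index`
    (the transcriptional start site) are searched.
    """
    dna = dna.upper()
    start = max(0, tss_index - window)
    upstream = dna[start:tss_index]
    # rfind returns the last match, i.e. the one closest to the start site.
    position = upstream.rfind(motif)
    return None if position == -1 else start + position


# Hypothetical sequence: the start site is placed at index 20.
sequence = "GCGC" + "TATAAA" + "GGCGCGGCGC" + "ATGCCC"
print(find_tata_box(sequence, tss_index=20))      # motif found upstream
print(find_tata_box("GCGCGCGCGC", tss_index=10))  # no motif present
```

Narrowing `window` restricts the search to sequence closer to the start site.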


In addition to the general transcription factors, other transcription factors 
can bind to the promoter to regulate gene transcription. These transcription 
factors bind to the promoters of a specific set of genes. They are not general 
transcription factors that bind to every promoter complex, but are recruited 
to a specific sequence on the promoter of a specific gene. There are 
hundreds of transcription factors in a cell that each bind specifically to a 
particular DNA sequence motif. When a transcription factor binds to a site 
in the promoter just upstream of the encoded gene, that site is referred to 
as a cis-acting element, because it is on the same chromosome just next to 
the gene. The 
region that a particular transcription factor binds to is called the 
transcription factor binding site. Transcription factors respond to 
environmental stimuli that cause the proteins to find their binding sites and 
initiate transcription of the gene that is needed. 


Enhancers and Transcription 


In some eukaryotic genes, there are regions that help increase or enhance 
transcription. These regions, called enhancers, are not necessarily close to 
the genes they enhance. They can be located upstream of a gene, within the 
coding region of the gene, downstream of a gene, or may be thousands of 
nucleotides away. 


Enhancer regions are binding sequences, or sites, for transcription factors. 
When a DNA-bending protein binds, the shape of the DNA changes 
([link]). This shape change allows for the interaction of the activators bound 
to the enhancers with the transcription factors bound to the promoter region 
and the RNA polymerase. Whereas DNA is generally depicted as a straight 
line in two dimensions, it is actually a three-dimensional object. Therefore, 
a nucleotide sequence thousands of nucleotides away can fold over and 
interact with a specific promoter. 


Turning Genes Off: Transcriptional Repressors 


Like prokaryotic cells, eukaryotic cells also have mechanisms to prevent 
transcription. Transcriptional repressors can bind to promoter or enhancer 
regions and block transcription. Like the transcriptional activators, 
repressors respond to external stimuli to prevent the binding of activating 
transcription factors. 


Section Summary 


To start transcription, general transcription factors, such as TFIID, TFIIH, 
and others, must first bind to the TATA box and recruit RNA polymerase to 
that location. The binding of additional regulatory transcription factors to 
cis-acting elements will either increase or prevent transcription. In addition 
to promoter sequences, enhancer regions help augment transcription. 
Enhancers can be upstream, downstream, within a gene itself, or on other 
chromosomes. Transcription factors bind to enhancer regions to increase or 
prevent transcription. 


Review Questions 


Exercise: 


Problem: 
The binding of ________ is required for transcription to start. 


a. a protein 

b. DNA polymerase 

c. RNA polymerase 

d. a transcription factor 


Solution: 


C 
Exercise: 
Problem: 


What will result from the binding of a transcription factor to an 
enhancer region? 


a. decreased transcription of an adjacent gene 

b. increased transcription of a distant gene 

c. alteration of the translation of an adjacent gene 
d. initiation of the recruitment of RNA polymerase 


Solution: 


B 


Free Response 


Exercise: 
Problem: 


A mutation within the promoter region can alter transcription of a 
gene. Describe how this can happen. 


Solution: 


A mutation in the promoter region can change the binding site for a 
transcription factor that normally binds to increase transcription. The 
mutation could either decrease the ability of the transcription factor to 
bind, thereby decreasing transcription, or it can increase the ability of 
the transcription factor to bind, thus increasing transcription. 


Exercise: 


Problem: 


What could happen if a cell had too much of an activating transcription 
factor present? 


Solution: 

If too much of an activating transcription factor were present, then 
transcription would be increased in the cell. This could lead to 
dramatic alterations in cell function. 


Glossary 


cis-acting element 
transcription factor binding sites within the promoter that regulate the 
transcription of a gene adjacent to it 


enhancer 
segment of DNA that is upstream, downstream, perhaps thousands of 
nucleotides away, or on another chromosome that influences the 
transcription of a specific gene 


trans-acting element 
transcription factor binding site found outside the promoter or on 
another chromosome that influences the transcription of a particular 
gene 


transcription factor binding site 
sequence of DNA to which a transcription factor binds 


Eukaryotic Post-transcriptional Gene Regulation 
By the end of this section, you will be able to: 


• Understand RNA splicing and explain its role in regulating gene 
expression 
• Describe the importance of RNA stability in gene regulation 


RNA is transcribed, but must be processed into a mature form before 
translation can begin. This processing after an RNA molecule has been 
transcribed, but before it is translated into a protein, is called post- 
transcriptional modification. As with the epigenetic and transcriptional 
stages of processing, this post-transcriptional step can also be regulated to 
control gene expression in the cell. If the RNA is not processed, shuttled, or 
translated, then no protein will be synthesized. 


RNA splicing, the first stage of post-transcriptional control 


In eukaryotic cells, the RNA transcript often contains regions, called 
introns, that are removed prior to translation. The regions of RNA that code 
for protein are called exons ([link]). After an RNA molecule has been 
transcribed, but prior to its departure from the nucleus to be translated, the 
RNA is processed and the introns are removed by splicing. 




Pre-mRNA can be alternatively spliced to 
create different proteins. 


Note: 

Evolution Connection 

Alternative RNA Splicing 

In the 1970s, genes were first observed that exhibited alternative RNA 
splicing. Alternative RNA splicing is a mechanism that allows different 
protein products to be produced from one gene when different 
combinations of introns, and sometimes exons, are removed from the 
transcript ([link]). This alternative splicing can be haphazard, but more 
often it is controlled and acts as a mechanism of gene regulation, with the 
frequency of different splicing alternatives controlled by the cell as a way 
to control the production of different protein products in different cells or 
at different stages of development. Alternative splicing is now understood 
to be a common mechanism of gene regulation in eukaryotes; according to 
one estimate, 70 percent of genes in humans are expressed as multiple 
proteins through alternative splicing. 


Exon skipping 
Mutually exclusive exons 
Alternative 5' donor sites 
Alternative 3' acceptor sites 
Intron retention 


There are five basic modes of alternative 
splicing. 
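
Exon skipping, one of the five modes named in the figure, shows how a single gene can yield several products. The Python sketch below enumerates the possible transcripts when some exons may be skipped. It is a toy model under stated assumptions: the function name `exon_skipping_isoforms`, the exon labels E1 to E4, and the choice of which exons are optional are all hypothetical.

```python
from itertools import combinations


def exon_skipping_isoforms(exons, optional):
    """List every isoform obtained by skipping any subset of the optional exons.

    `exons` is the ordered list of exon names in the pre-mRNA;
    `optional` is the set of exon names that the cell may skip.
    """
    skippable = [exon for exon in exons if exon in optional]
    isoforms = []
    for k in range(len(skippable) + 1):
        for skipped in combinations(skippable, k):
            # Keep the remaining exons in their original order.
            isoforms.append([exon for exon in exons if exon not in skipped])
    return isoforms


pre_mrna = ["E1", "E2", "E3", "E4"]
for isoform in exon_skipping_isoforms(pre_mrna, {"E2", "E3"}):
    print("-".join(isoform))
```

With two optional exons, this model produces four distinct transcripts from one gene, and each additional optional exon doubles that number.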


How could alternative splicing evolve? Introns have a beginning and 
ending recognition sequence; it is easy to imagine the failure of the 
splicing mechanism to identify the end of an intron and instead find the 
end of the next intron, thus removing two introns and the intervening exon. 
In fact, there are mechanisms in place to prevent such exon skipping, but 
mutations are likely to lead to their failure. Such “mistakes” would more 
than likely produce a nonfunctional protein. Indeed, the cause of many 
genetic diseases is alternative splicing rather than mutations in a sequence. 
However, alternative splicing would create a protein variant without the 
loss of the original protein, opening up possibilities for adaptation of the 
new variant to new functions. Gene duplication has played an important 
role in the evolution of new functions in a similar way by providing genes 
that may evolve without eliminating the original, functional protein. 


Note: 
Link to Learning 
Visualize how mRNA splicing happens by watching the process in action 
in this video. 
https://www.openstaxcollege.org/l/mRNA_splicing 


Control of RNA Stability 


Before the mRNA leaves the nucleus, it is given two protective "caps" that 
prevent the end of the strand from degrading during its journey. The 5' cap, 
which is placed on the 5' end of the mRNA, is usually composed of a 
methylated guanosine triphosphate molecule (GTP). The poly-A tail, which 
is attached to the 3' end, is usually composed of a series of adenine 
nucleotides. Once the RNA is transported to the cytoplasm, the length of 
time that the RNA resides there can be controlled. Each RNA molecule has 
a defined lifespan and decays at a specific rate. This rate of decay can 
influence how much protein is in the cell. If the decay rate is increased, the 
RNA will not exist in the cytoplasm as long, shortening the time for 
translation to occur. Conversely, if the rate of decay is decreased, the RNA 
molecule will reside in the cytoplasm longer and more protein can be 
translated. This rate of decay is referred to as the RNA stability. If the RNA 
is stable, it will be detected for longer periods of time in the cytoplasm. 
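
The link between decay rate and protein output can be illustrated numerically. The Python sketch below assumes simple first-order (exponential) decay and a constant translation rate per mRNA. Both are modeling assumptions made for this illustration, not claims from the text, and the function names and rate values are hypothetical.

```python
import math


def fraction_remaining(decay_rate, time):
    """Fraction of an mRNA population still intact after `time` (first-order decay)."""
    return math.exp(-decay_rate * time)


def half_life(decay_rate):
    """Time for half of the mRNA population to be degraded."""
    return math.log(2) / decay_rate


def expected_protein_per_mrna(translation_rate, decay_rate):
    """Average proteins made per mRNA: translation rate times mean lifetime (1 / decay_rate)."""
    return translation_rate / decay_rate


# Hypothetical decay rates in units of 1/hour; translation rate in proteins/hour.
for label, k in [("stable mRNA", 0.1), ("unstable mRNA", 0.5)]:
    print(f"{label}: half-life = {half_life(k):.2f} h, "
          f"proteins per mRNA = {expected_protein_per_mrna(10, k):.0f}")
```

In this model, lowering the decay rate lengthens the half-life and raises protein output in proportion, which matches the qualitative statement that a more stable RNA allows more protein to be translated.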


Binding of proteins to the RNA can influence its stability. Proteins, called 
RNA-binding proteins, or RBPs, can bind to the regions of the RNA just 
upstream or downstream of the protein-coding region. These regions in the 
RNA that are not translated into protein are called the untranslated 
regions, or UTRs. They are not introns (those have been removed in the 
nucleus). Rather, these are regions that regulate mRNA localization, 
stability, and protein translation. The region just before the protein-coding 
region is called the 5' UTR, whereas the region after the coding region is 
called the 3' UTR ([link]). The binding of RBPs to these regions can 
increase or decrease the stability of an RNA molecule, depending on the 
specific RBP that binds. 


[Figure labels: RNA-binding proteins, 5' cap, poly-A tail] 


The protein-coding region of mRNA is flanked by 
5' and 3' untranslated regions (UTRs). The 
presence of RNA-binding proteins at the 5' or 3' 
UTR influences the stability of the RNA molecule. 


RNA Stability and microRNAs 


In addition to RBPs that bind to and control (increase or decrease) RNA 
stability, other elements called microRNAs can bind to the RNA molecule. 
These microRNAs, or miRNAs, are short RNA molecules that are only 21 to 
24 nucleotides in length. The miRNAs are made in the nucleus as longer 
pre-miRNAs. These pre-miRNAs are chopped into mature miRNAs by a 
protein called dicer. Like transcription factors and RBPs, mature miRNAs 
recognize a specific sequence and bind to the RNA; however, miRNAs also 
associate with a ribonucleoprotein complex called the RNA-induced 
silencing complex (RISC). RISC binds along with the miRNA to degrade 
the target mRNA. Together, miRNAs and RISC rapidly destroy 
the RNA molecule. 
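
Sequence recognition by a miRNA can be sketched as a search for the complementary site on the target RNA. The Python example below assumes perfect, full-length base pairing, which is a simplification (real miRNA-target pairing is often imperfect). The function names and the short sequences are hypothetical and chosen only to keep the example readable.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that pairs with `rna` in antiparallel orientation."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def find_target_sites(mrna, mirna):
    """Return every start index in `mrna` where `mirna` could pair perfectly."""
    site = reverse_complement(mirna)
    mrna = mrna.upper()
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


# Toy sequences, far shorter than a real miRNA or mRNA.
print(reverse_complement("UAGC"))
print(find_target_sites("AAGCUAAAGCUA", "UAGC"))
```

An mRNA with no matching site returns an empty list, meaning this miRNA would not target it under the perfect-pairing assumption.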


Section Summary 


Post-transcriptional control can occur at any stage after transcription, 
including RNA splicing, nuclear shuttling, and RNA stability. Once RNA is 
transcribed, it must be processed to create a mature RNA that is ready to be 
translated. This involves the removal of introns that do not code for protein. 
Spliceosomes bind to the signals that mark the exon/intron border to 
remove the introns and ligate the exons together. Once this occurs, the RNA 
is mature and can be translated. RNA is created and spliced in the nucleus, 
but needs to be transported to the cytoplasm to be translated. RNA is 
transported to the cytoplasm through the nuclear pore complex. Once the 
RNA is in the cytoplasm, the length of time it resides there before being 
degraded, called RNA stability, can also be altered to control the overall 
amount of protein that is synthesized. The RNA stability can be increased, 
leading to longer residency time in the cytoplasm, or decreased, leading to 
shortened time and less protein synthesis. RNA stability is controlled by 
RNA-binding proteins (RBPs) and microRNAs (miRNAs). These RBPs and 
miRNAs bind to the 5' UTR or the 3' UTR of the RNA to increase or 
decrease RNA stability. Depending on the RBP, the stability can be 
increased or decreased significantly; however, miRNAs always decrease 
stability and promote decay. 


Review Questions 


Exercise: 


Problem: 
Which of the following are involved in post-transcriptional control? 


a. control of RNA splicing 

b. control of RNA shuttling 
c. control of RNA stability 

d. all of the above 


Solution: 


D 
Exercise: 


Problem: 


Binding of an RNA binding protein will ________ the stability of the 
RNA molecule. 


a. increase 

b. decrease 

c. neither increase nor decrease 
d. either increase or decrease 


Solution: 


D 


Free Response 


Exercise: 


Problem: 


Describe how RBPs can prevent miRNAs from degrading an RNA 
molecule. 


Solution: 


RNA binding proteins (RBP) bind to the RNA and can either increase 
or decrease the stability of the RNA. If they increase the stability of 
the RNA molecule, the RNA will remain intact in the cell for a longer 
period of time than normal. Since both RBPs and miRNAs bind to the 
RNA molecule, RBP can potentially bind first to the RNA and prevent 
the binding of the miRNA that will degrade it. 


Exercise: 


Problem: 


How can external stimuli alter post-transcriptional control of gene 
expression? 


Solution: 


External stimuli can modify RNA-binding proteins (e.g., through 
phosphorylation of proteins) to alter their activity. 


Glossary 


3' UTR 
3' untranslated region; region just downstream of the protein-coding 
region in an RNA molecule that is not translated 


5' cap 
a methylated guanosine triphosphate (GTP) molecule that is attached 
to the 5' end of a messenger RNA to protect the end from degradation 


5' UTR 
5' untranslated region; region just upstream of the protein-coding 
region in an RNA molecule that is not translated 


dicer 
enzyme that chops the pre-miRNA into the mature form of the miRNA 


microRNA (miRNA) 
small RNA molecules (approximately 21 nucleotides in length) that 
bind to RNA molecules to degrade them 


poly-A tail 
a series of adenine nucleotides that are attached to the 3' end of an 
mRNA to protect the end from degradation 


RNA-binding protein (RBP) 
protein that binds to the 3' or 5' UTR to increase or decrease the RNA 
stability 


RNA stability 
how long an RNA molecule will remain intact in the cytoplasm 


untranslated region 
segment of the RNA molecule that is not translated into protein. 
These regions lie before (upstream or 5') and after (downstream or 3') 
the protein-coding region 


RISC 
protein complex that binds along with the miRNA to the RNA to 
degrade it 


Eukaryotic Translational and Post-translational Gene Regulation 
By the end of this section, you will be able to: 


• Understand the process of translation and discuss its key factors 
• Describe how the initiation complex controls translation 
• Explain the different ways in which the post-translational control of 
gene expression takes place 


After the RNA has been transported to the cytoplasm, it is translated into 
protein. Control of this process is largely dependent on the RNA molecule. 
As previously discussed, the stability of the RNA will have a large impact 
on its translation into a protein. As the stability changes, the amount of time 
that it is available for translation also changes. 


The Initiation Complex and Translation Rate 


Like transcription, translation is controlled by proteins that bind and initiate 
the process. In translation, the complex that assembles to start the process is 
referred to as the initiation complex. The first protein to bind to the RNA 
to initiate translation is the eukaryotic initiation factor-2 (eIF-2). The 
eIF-2 protein is active when it binds to the high-energy molecule guanosine 
triphosphate (GTP). GTP provides the energy to start the reaction by 
giving up a phosphate and becoming guanosine diphosphate (GDP). The 
eIF-2 protein bound to GTP binds to the small 40S ribosomal subunit. 
When bound, the methionine initiator tRNA associates with the eIF-2/40S 
ribosome complex, bringing along with it the mRNA to be translated. At 
this point, when the initiator complex is assembled, the GTP is converted 
into GDP and energy is released. The phosphate and the eIF-2 protein are 
released from the complex and the large 60S ribosomal subunit binds to 
translate the RNA. The binding of eIF-2 to the RNA is controlled by 
phosphorylation. If eIF-2 is phosphorylated, it undergoes a conformational 
change and cannot bind to GTP. Therefore, the initiation complex cannot 
form properly and translation is impeded ([link]). When eIF-2 remains 
unphosphorylated, the initiation complex forms normally and translation 
proceeds. 


Note: 
Art Connection 


[Figure, top panel: ribosome small (40S) subunit; no translation.] 
When eIF-2 is phosphorylated, translation is blocked. 


[Figure, bottom panel: ribosome small (40S) subunit; translation occurs.] 
When eIF-2 is not phosphorylated, translation occurs. 


Gene expression can be controlled 
by factors that bind the translation 
initiation complex. 


An increase in phosphorylation levels of eIF-2 has been observed in 
patients with neurodegenerative diseases such as Alzheimer’s, Parkinson’s, 
and Huntington’s. What impact do you think this might have on protein 
synthesis? 


Chemical Modifications, Protein Activity, and Longevity 


Proteins can be chemically modified with the addition of groups including 
methyl, phosphate, acetyl, and ubiquitin groups. The addition or removal of 
these groups from proteins regulates their activity or the length of time they 
exist in the cell. Sometimes these modifications can regulate where a 
protein is found in the cell—for example, in the nucleus, the cytoplasm, or 
attached to the plasma membrane. 


Chemical modifications occur in response to external stimuli such as stress, 
the lack of nutrients, heat, or ultraviolet light exposure. These changes can 
alter epigenetic accessibility, transcription, mRNA stability, or translation, 
all resulting in changes in expression of various genes. This is an efficient 
way for the cell to rapidly change the levels of specific proteins in response 
to the environment. Because proteins are involved in every stage of gene 
regulation, the phosphorylation of a protein (depending on the protein that 
is modified) can alter accessibility to the chromosome, can alter transcription 
(by altering transcription factor binding or function), can change nuclear 
shuttling (by influencing modifications to the nuclear pore complex), can 
alter RNA stability (by binding or not binding to the RNA to regulate its 
stability), can modify translation (increase or decrease), or can change 
post-translational modifications (add or remove phosphates or other chemical 
modifications). 


The addition of a ubiquitin group to a protein marks that protein for 
degradation. Ubiquitin acts like a flag indicating that the protein lifespan is 
complete. These proteins are moved to the proteasome, a large protein complex 
that functions to remove proteins, to be degraded ([link]). One way to control 
gene expression, therefore, is to alter the longevity of the protein. 


[Figure labels: ubiquitin, proteasome] 


Proteins with ubiquitin tags are 
marked for degradation within the 
proteasome. 


Section Summary 


Changing the status of the RNA or the protein itself can affect the amount 
of protein, the function of the protein, or how long it is found in the cell. To 
translate the protein, a protein initiator complex must assemble on the RNA. 
Modifications (such as phosphorylation) of proteins in this complex can 
prevent proper translation from occurring. Once a protein has been 
synthesized, it can be modified (phosphorylated, acetylated, methylated, or 
ubiquitinated). These post-translational modifications can greatly impact 
the stability, degradation, or function of the protein. 


Art Connections 


Exercise: 


Problem: 


[link] An increase in phosphorylation levels of eIF-2 has been 
observed in patients with neurodegenerative diseases such as 
Alzheimer’s, Parkinson’s, and Huntington’s. What impact do you think 
this might have on protein synthesis? 


Solution: 


[link] Protein synthesis would be inhibited. 


Review Questions 


Exercise: 
Problem: 


Post-translational modifications of proteins can affect which of the 
following? 


a. protein function 

b. transcriptional regulation 
c. chromatin modification 
d. all of the above 


Solution: 


A 


Free Response 


Exercise: 


Problem: 


Protein modification can alter gene expression in many ways. Describe 
how phosphorylation of proteins can alter gene expression. 


Solution: 


Because proteins are involved in every stage of gene regulation, 
phosphorylation of a protein (depending on the protein that is 
modified) can alter accessibility to the chromosome, can alter 
transcription (by altering transcription factor binding or function), 
can change nuclear shuttling (by influencing modifications to the 
nuclear pore complex), can alter RNA stability (by binding or not 
binding to the RNA to regulate its stability), can modify translation 
(increase or decrease), or can change post-translational modifications 
(add or remove phosphates or other chemical modifications). 


Exercise: 
Problem: 
Alternative forms of a protein can be beneficial or harmful to a cell. 


What do you think would happen if too much of an alternative protein 
bound to the 3' UTR of an RNA and caused it to degrade? 


Solution: 


If the RNA degraded, then less of the protein that the RNA encodes 
would be translated. This could have dramatic implications for the cell. 


Exercise: 


Problem: 


Changes in epigenetic modifications alter the accessibility and 
transcription of DNA. Describe how environmental stimuli, such as 
ultraviolet light exposure, could modify gene expression. 


Solution: 


Environmental stimuli, like ultraviolet light exposure, can alter the 
modifications to the histone proteins or DNA. Such stimuli may 
change an actively transcribed gene into a silenced gene by removing 
acetyl groups from histone proteins or by adding methyl groups to 
DNA. 


Glossary 


eukaryotic initiation factor-2 (eIF-2) 
protein that binds first to an mRNA to initiate translation 


guanosine diphosphate (GDP) 
molecule that is left after the energy is used to start translation 


guanosine triphosphate (GTP) 
energy-providing molecule that binds to eIF-2 and is needed for 
translation 


initiation complex 
protein complex containing eIF-2 that starts translation 


large 60S ribosomal subunit 
second, larger ribosomal subunit that binds to the RNA to translate it 
into protein 


proteasome 
large protein complex that degrades proteins 


small 40S ribosomal subunit 
ribosomal subunit that binds to the RNA to translate it into protein 


Cancer and Gene Regulation 
By the end of this section, you will be able to: 


• Describe how changes to gene expression can cause cancer 
• Explain how changes to gene expression at different levels can disrupt 
the cell cycle 
• Discuss how understanding regulation of gene expression can lead to 
better drug design 


Cancer is not a single disease but includes many different diseases. In 
cancer cells, mutations modify cell-cycle control and cells don’t stop 
growing as they normally would. Mutations can also alter the growth rate or 
the progression of the cell through the cell cycle. One example of a gene 
modification that alters the growth rate is increased phosphorylation of 
cyclin B, a protein that controls the progression of a cell through the cell 
cycle and serves as a cell-cycle checkpoint protein. 


For cells to move through each phase of the cell cycle, the cell must pass 
through checkpoints. This ensures that the cell has properly completed the 
step and has not encountered any mutation that will alter its function. Many 
proteins, including cyclin B, control these checkpoints. The 
phosphorylation of cyclin B, a post-translational event, alters its function. 
As a result, cells can progress through the cell cycle unimpeded, even if 
mutations exist in the cell and its growth should be terminated. This post- 
translational change of cyclin B prevents it from controlling the cell cycle 
and contributes to the development of cancer. 


Cancer: Disease of Altered Gene Expression 


Cancer can be described as a disease of altered gene expression. There are 
many proteins that are turned on or off (gene activation or gene silencing) 
that dramatically alter the overall activity of the cell. A gene that is not 
normally expressed in that cell can be switched on and expressed at high 
levels. This can be the result of gene mutation or changes in gene regulation 
(epigenetic, transcription, post-transcription, translation, or post- 
translation). 


Changes in epigenetic regulation, transcription, RNA stability, protein 
translation, and post-translational control can be detected in cancer. While 
these changes don’t occur simultaneously in one cancer, changes at each of 
these levels can be detected when observing cancer at different sites in 
different individuals. Therefore, changes in histone acetylation (an epigenetic 
modification whose loss leads to gene silencing), activation of transcription factors 
by phosphorylation, increased RNA stability, increased translational 
control, and protein modification can all be detected at some point in 
various cancer cells. Scientists are working to understand the common 
changes that give rise to certain types of cancer or how a modification 
might be exploited to destroy a tumor cell. 


Tumor Suppressor Genes, Oncogenes, and Cancer 


In normal cells, some genes function to prevent excess, inappropriate cell 
growth. These are tumor suppressor genes, which are active in normal cells 
to prevent uncontrolled cell growth. There are many tumor suppressor 
genes in cells. The most studied tumor suppressor gene is p53, which is 
mutated in over 50 percent of all cancer types. The p53 protein itself 
functions as a transcription factor. It can bind to sites in the promoters of 
genes to initiate transcription. Therefore, the mutation of p53 in cancer will 
dramatically alter the transcriptional activity of its target genes. 


Note: 
Link to Learning 
Watch this animation to learn more about the use of p53 in fighting cancer. 


Proto-oncogenes are positive cell-cycle regulators. When mutated, proto- 
oncogenes can become oncogenes and cause cancer. Overexpression of the 
oncogene can lead to uncontrolled cell growth. This is because oncogenes 
can alter transcriptional activity, stability, or protein translation of another 
gene that directly or indirectly controls cell growth. An example of an 
oncogene involved in cancer is a protein called myc. Myc is a transcription 
factor that is aberrantly activated in Burkitt’s lymphoma, a cancer of the 
lymph system. Overexpression of myc transforms normal B cells into 
cancerous cells that continue to grow uncontrollably. High B-cell numbers 
can result in tumors that can interfere with normal bodily function. Patients 
with Burkitt’s lymphoma can develop tumors on their jaw or in their mouth 
that interfere with the ability to eat. 


Cancer and Epigenetic Alterations 


Silencing genes through epigenetic mechanisms is also very common in 
cancer cells. There are characteristic modifications to histone proteins and 
DNA that are associated with silenced genes. In cancer cells, the DNA in 
the promoter region of silenced genes is methylated on cytosine DNA 
residues in CpG islands. Histone proteins that surround that region lack the 
acetylation modification that is present when the genes are expressed in 
normal cells. This combination of DNA methylation and histone 
deacetylation (epigenetic modifications that lead to gene silencing) is 
commonly found in cancer. When these modifications occur, the gene 
present in that chromosomal region is silenced. Increasingly, scientists 
understand how epigenetic changes are altered in cancer. Because these 
changes are temporary and can be reversed (for example, by preventing the 
action of the histone deacetylase protein that removes acetyl groups, or of the 
DNA methyl transferase enzymes that add methyl groups to cytosines in 
DNA), it is possible to design new drugs and new therapies to take 
advantage of the reversible nature of these processes. Indeed, many 
researchers are testing how a silenced gene can be switched back on in a 
cancer cell to help re-establish normal growth patterns. 


Genes involved in the development of many other illnesses, ranging from 
allergies to inflammation to autism, are thought to be regulated by 
epigenetic mechanisms. As our knowledge of how genes are controlled 
deepens, new ways to treat diseases like cancer will emerge. 


Cancer and Transcriptional Control 


Alterations in cells that give rise to cancer can affect the transcriptional 
control of gene expression. Mutations that activate transcription factors, 
such as increased phosphorylation, can increase the binding of a 
transcription factor to its binding site in a promoter. This could lead to 
increased transcriptional activation of that gene that results in modified cell 
growth. Alternatively, a mutation in the DNA of a promoter or enhancer 
region can increase the binding ability of a transcription factor. This could 
also lead to the increased transcription and aberrant gene expression that is 
seen in cancer cells. 


Researchers have been investigating how to control the transcriptional 
activation of gene expression in cancer. Identifying how a transcription 
factor binds, or which activating pathway can be blocked to turn a gene off, has 
led to new drugs and new ways to treat cancer. In breast cancer, for 
example, many proteins are overexpressed. This can lead to increased 
phosphorylation of key transcription factors that increase transcription. One 
such example is the overexpression of the epidermal growth factor receptor 
(EGFR) in a subset of breast cancers. The EGFR pathway activates many 
protein kinases that, in turn, activate many transcription factors that control 
genes involved in cell growth. New drugs that prevent the activation of 
EGFR have been developed and are used to treat these cancers. 


Cancer and Post-transcriptional Control 


Changes in the post-transcriptional control of a gene can also result in 
cancer. Recently, several groups of researchers have shown that specific 
cancers have altered expression of miRNAs. Because miRNAs bind to the 
3' UTR of RNA molecules to degrade them, overexpression of these 
miRNAs could be detrimental to normal cellular activity. Too many 
miRNAs could dramatically decrease the RNA population leading to a 
decrease in protein expression. Several studies have demonstrated a change 
in the miRNA population in specific cancer types. It appears that the subset 
of miRNAs expressed in breast cancer cells is quite different from the 
subset expressed in lung cancer cells or even from normal breast cells. This 
suggests that alterations in miRNA activity can contribute to the growth of 
breast cancer cells. These types of studies also suggest that if some 
miRNAs are specifically expressed only in cancer cells, they could be 
potential drug targets. It would, therefore, be conceivable that new drugs 
that turn off miRNA expression in cancer could be an effective method to 
treat cancer. 


Cancer and Translational/Post-translational Control 


There are many examples of how translational or post-translational 
modifications of proteins arise in cancer. Modifications are found in cancer 
cells from the increased translation of a protein to changes in protein 
phosphorylation to alternative splice variants of a protein. An example of 
how the expression of an alternative form of a protein can have 
dramatically different outcomes is seen in colon cancer cells. The c-Flip 
protein, a protein involved in mediating the cell death pathway, comes in 
two forms: long (c-FLIPL) and short (c-FLIPS). Both forms appear to be 
involved in initiating controlled cell death mechanisms in normal cells. 
However, in colon cancer cells, expression of the long form results in 
increased cell growth instead of cell death. Clearly, the expression of the 
wrong protein dramatically alters cell function and contributes to the 
development of cancer. 


New Drugs to Combat Cancer: Targeted Therapies 


Scientists are using what is known about the regulation of gene expression 
in disease states, including cancer, to develop new ways to treat and prevent 
disease development. Many scientists are designing drugs on the basis of 
the gene expression patterns within individual tumors. This idea, that 
therapy and medicines can be tailored to an individual, has given rise to the 
field of personalized medicine. With an increased understanding of gene 
regulation and gene function, medicines can be designed to specifically 
target diseased cells without harming healthy cells. Some new medicines, 
called targeted therapies, have exploited the overexpression of a specific 
protein or the mutation of a gene to develop a new medication to treat 
disease. One such example is the use of anti-EGF receptor medications to 
treat the subset of breast cancer tumors that have very high levels of the 
EGF receptor protein. Undoubtedly, more targeted therapies will be developed as 
scientists learn more about how gene expression changes can cause cancer. 


Note: 

Career Connection 

Clinical Trial Coordinator 

A clinical trial coordinator is the person managing the proceedings of the 
clinical trial. This job includes coordinating patient schedules and 
appointments, maintaining detailed notes, building the database to track 
patients (especially for long-term follow-up studies), ensuring proper 
documentation has been acquired and accepted, and working with the 
nurses and doctors to facilitate the trial and publication of the results. A 
clinical trial coordinator may have a science background, like a nursing 
degree, or other certification. People who have worked in science labs or in 
clinical offices are also qualified to become a clinical trial coordinator. 
These jobs are generally in hospitals; however, some clinics and doctor’s 
offices also conduct clinical trials and may hire a coordinator. 


Section Summary 


Cancer can be described as a disease of altered gene expression. Changes at 
every level of eukaryotic gene expression can be detected in some form of 
cancer at some point in time. In order to understand how changes to gene 
expression can cause cancer, it is critical to understand how each stage of 
gene regulation works in normal cells. By understanding the mechanisms of 
control in normal, non-diseased cells, it will be easier for scientists to 
understand what goes wrong in disease states including complex ones like 
cancer. 


Review Questions 


Exercise: 


Problem: 


Cancer-causing genes are called ________. 


a. transformation genes 
b. tumor suppressor genes 
c. oncogenes 

d. mutated genes 


Solution: 


C 
Exercise: 


Problem: 


Targeted therapies are used in patients with a set gene expression 
pattern. A targeted therapy that prevents the activation of the estrogen 
receptor in breast cancer would be beneficial to which type of patient? 


a. patients who express the EGFR receptor in normal cells 

b. patients with a mutation that inactivates the estrogen receptor 

c. patients with lots of the estrogen receptor expressed in their tumor 
d. patients that have no estrogen receptor expressed in their tumor 


Solution: 


C 


Free Response 


Exercise: 


Problem: 


New drugs are being developed that decrease DNA methylation and 
prevent the removal of acetyl groups from histone proteins. Explain 
how these drugs could affect gene expression to help kill tumor cells. 


Solution: 


These drugs will keep the histone proteins and the DNA methylation 
patterns in the open chromosomal configuration so that transcription is 
feasible. If a gene is silenced, these drugs could reverse the epigenetic 
configuration to re-express the gene. 


Exercise: 


Problem: 


How can understanding the gene expression pattern in a cancer cell tell 
you something about that specific form of cancer? 


Solution: 


Understanding which genes are expressed in a cancer cell can help 
diagnose the specific form of cancer. It can also help identify treatment 
options for that patient. For example, if a breast cancer tumor 
expresses the EGFR in high numbers, it might respond to specific 
anti-EGFR therapy. If that receptor is not expressed, it would not respond 
to that therapy. 


Glossary 


DNA methylation 
epigenetic modification that leads to gene silencing; commonly found 
in cancer cells 


histone acetylation 
epigenetic modification that accompanies gene expression; its loss 
(histone deacetylation) leads to gene silencing and is commonly found 
in cancer cells 


myc 
oncogene that causes cancer in many cancer cells 


Introduction 
A human, as well as every sexually reproducing organism, begins life as a 
fertilized egg (embryo) or zygote. Trillions of cell divisions subsequently 
occur in a controlled manner to produce a complex, multicellular human. In 
other words, that original single cell is the ancestor of every other cell in the 
body. Once a being is fully grown, cell reproduction is still necessary to 
repair or regenerate tissues. For example, new blood and skin cells are 
constantly being produced. All multicellular organisms use cell division for 
growth and the maintenance and repair of cells and tissues. Cell division is 
tightly regulated, and the occasional failure of regulation can have life- 
threatening consequences. Single-celled organisms use cell division as their 
method of reproduction. 


Cell Division 
By the end of this section, you will be able to: 


• Describe the structure of prokaryotic and eukaryotic genomes 
• Distinguish between chromosomes, genes, and traits 
• Describe the mechanisms of chromosome compaction 


The continuity of life from one cell to another has its foundation in the 
reproduction of cells by way of the cell cycle. The cell cycle is an orderly 
sequence of events that describes the stages of a cell’s life from the division 
of a single parent cell to the production of two new daughter cells. The 
mechanisms involved in the cell cycle are highly regulated. 


Genomic DNA 


Before discussing the steps a cell must undertake to replicate, a deeper 
understanding of the structure and function of a cell’s genetic information is 
necessary. A cell’s DNA, packaged as a double-stranded DNA molecule, is 
called its genome. In prokaryotes, the genome is composed of a single, 
double-stranded DNA molecule in the form of a loop or circle ([link]). The 
region in the cell containing this genetic material is called a nucleoid. Some 
prokaryotes also have smaller loops of DNA called plasmids that are not 
essential for normal growth. Bacteria can exchange these plasmids with 
other bacteria, sometimes receiving beneficial new genes that the recipient 
can add to their chromosomal DNA. Antibiotic resistance is one trait that 
often spreads through a bacterial colony through plasmid exchange. 


[Figure labels: cell membrane, chromosome (DNA), nucleoid region] 


Prokaryotes, including bacteria 
and archaea, have a single, 
circular chromosome located in a 
central region called the nucleoid. 


In eukaryotes, the genome consists of several double-stranded linear DNA 
molecules ([link]). Each species of eukaryotes has a characteristic number 
of chromosomes in the nuclei of its cells. Human body cells have 46 
chromosomes, while human gametes (sperm or eggs) have 23 
chromosomes each. A typical body cell, or somatic cell, contains two 
matched sets of chromosomes, a configuration known as diploid. The letter 
n is used to represent a single set of chromosomes; therefore, a diploid 
organism is designated 2n. Human cells that contain one set of 
chromosomes are called gametes, or sex cells; these are eggs and sperm, 
and are designated 1n, or haploid. 


There are 23 pairs of homologous 
chromosomes in a female human 
somatic cell. The condensed 
chromosomes are viewed within 
the nucleus (top), removed from a 
cell in mitosis and spread out on a 
slide (right), and artificially 
arranged according to length (left); 
an arrangement like this is called a 
karyotype. In this image, the 
chromosomes were exposed to 
fluorescent stains for 
differentiation of the different 
chromosomes. A method of 
staining called “chromosome 
painting” employs fluorescent 
dyes that highlight chromosomes 
in different colors. (credit: 
National Human Genome 
Project/NIH) 


Matched pairs of chromosomes in a diploid organism are called 
homologous (“same knowledge”) chromosomes. Homologous 
chromosomes are the same length and have specific nucleotide segments 
called genes in exactly the same location, or locus. Genes, the functional 
units of chromosomes, determine specific characteristics by coding for 
specific proteins. Traits are the variations of those characteristics. For 
example, hair color is a characteristic with traits that are blonde, brown, or 
black. 


Each copy of a homologous pair of chromosomes originates from a 
different parent; therefore, the genes themselves are not identical. The 
variation of individuals within a species is due to the specific combination 
of the genes inherited from both parents. Even a slightly altered sequence of 
nucleotides within a gene can result in an alternative trait. For example, 
there are three possible gene sequences on the human chromosome that 
code for blood type: sequence A, sequence B, and sequence O. Because all 
diploid human cells have two copies of the chromosome that determines 
blood type, the blood type (the trait) is determined by which two versions of 
the marker gene are inherited. It is possible to have two copies of the same 
gene sequence on both homologous chromosomes, with one on each (for 
example, AA, BB, or OO), or two different sequences, such as AB. 
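
The possible pairings of the three sequences can be enumerated with a short sketch. It assumes only what the paragraph states: three sequences (A, B, O) and two copies per diploid cell, with order not mattering. The names `ALLELES` and `possible_genotypes` are illustrative, not part of the text.

```python
from itertools import combinations_with_replacement

# The three gene sequences named in the text.
ALLELES = ("A", "B", "O")


def possible_genotypes(alleles=ALLELES):
    """List every unordered pair of sequences a diploid cell could carry."""
    return ["".join(pair) for pair in combinations_with_replacement(alleles, 2)]


if __name__ == "__main__":
    genotypes = possible_genotypes()
    print(genotypes)       # pairs with two identical or two different sequences
    print(len(genotypes))  # number of distinct combinations
```

Three of the resulting pairs carry the same sequence on both homologous chromosomes (AA, BB, OO) and three carry two different sequences (AB, AO, BO), for six combinations in total.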


Minor variations of traits, such as blood type, eye color, and handedness, 
contribute to the natural variation found within a species. However, if the 
entire DNA sequence from any pair of human homologous chromosomes is 
compared, the difference is less than one percent. The sex chromosomes, X 
and Y, are the single exception to the rule of homologous chromosome 
uniformity: Other than a small amount of homology that is necessary to 
accurately produce gametes, the genes found on the X and Y chromosomes 
are different. 


Eukaryotic Chromosomal Structure and Compaction 


If the DNA from all 46 chromosomes in a human cell nucleus were laid out 
end to end, it would measure approximately two meters; however, its 
diameter would be only 2 nm. Considering that the size of a typical human 
cell is about 10 µm (100,000 cells lined up to equal one meter), DNA must 
be tightly packaged to fit in the cell’s nucleus. At the same time, it must 
also be readily accessible for the genes to be expressed. During some stages 
of the cell cycle, the long strands of DNA are condensed into compact 
chromosomes. There are a number of ways that chromosomes are 
compacted. 


In the first level of compaction, short stretches of the DNA double helix 
wrap around a core of eight histone proteins at regular intervals along the 
entire length of the chromosome ([link]). The DNA-histone complex is 
called chromatin. The beadlike, histone-DNA complex is called a 
nucleosome, and DNA connecting the nucleosomes is called linker DNA. 
A DNA molecule in this form is about seven times shorter than the double 
helix without the histones, and the beads are about 10 nm in diameter, in 
contrast with the 2-nm diameter of a DNA double helix. The next level of 
compaction occurs as the nucleosomes and the linker DNA between them 
are coiled into a 30-nm chromatin fiber. This coiling further shortens the 
chromosome so that it is now about 50 times shorter than the extended 
form. In the third level of packing, a variety of fibrous proteins is used to 
pack the chromatin. These fibrous proteins also ensure that each 
chromosome in a non-dividing cell occupies a particular area of the nucleus 
that does not overlap with that of any other chromosome (see the top image 
in [link]). 
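
The figures quoted in this section can be checked with a little arithmetic. The sketch below uses only the numbers given in the text (about two meters of DNA, a cell about 10 µm across, and roughly 7-fold and 50-fold shortening at the first two levels of compaction); the constant and function names are illustrative assumptions.

```python
# Values taken from the text; all are approximate.
DNA_LENGTH_M = 2.0        # total DNA of 46 chromosomes laid end to end
CELL_DIAMETER_M = 10e-6   # typical human cell, about 10 micrometers
NUCLEOSOME_FOLD = 7       # "about seven times shorter" with nucleosomes
FIBER_FOLD = 50           # "about 50 times shorter" as a 30-nm fiber


def length_after(fold, length_m=DNA_LENGTH_M):
    """Length in meters after shortening the extended DNA by `fold`."""
    return length_m / fold


def in_cell_diameters(length_m):
    """Express a length as a multiple of one cell diameter."""
    return length_m / CELL_DIAMETER_M


if __name__ == "__main__":
    for label, fold in (("extended", 1),
                        ("nucleosomes", NUCLEOSOME_FOLD),
                        ("30-nm fiber", FIBER_FOLD)):
        meters = length_after(fold)
        print(f"{label}: {meters:.3f} m, "
              f"about {in_cell_diameters(meters):,.0f} cell diameters")
```

Even at the 30-nm fiber stage the DNA would still be about 4 cm long, thousands of times the width of a cell, which is why the further levels of packing described above are needed.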


Organization of Eukaryotic Chromosomes 


[Figure panels, from least to most compact: DNA double helix; DNA wrapped 
around histone; nucleosomes coiled into a chromatin fiber; further 
condensation of chromatin; duplicated chromosome] 


Double-stranded DNA wraps 
around histone proteins to form 
nucleosomes that have the 
appearance of “beads on a string.” 
The nucleosomes are coiled into a 
30-nm chromatin fiber. When a 
cell undergoes mitosis, the 
chromosomes condense even 
further. 


DNA replicates in the S phase of interphase. After replication, the 
chromosomes are composed of two linked sister chromatids. When fully 
compact, the pairs of identically packed chromosomes are bound to each 
other by cohesin proteins. The connection between the sister chromatids is 
closest in a region called the centromere. The conjoined sister chromatids, 
with a diameter of about 1 µm, are visible under a light microscope. The 
centromeric region is highly condensed and thus will appear as a constricted 
area. 


Note: 
This animation illustrates the different levels of chromosome packing. 
https://www.openstaxcollege.org/l/Packaged DNA 


Section Summary 


Prokaryotes have a single circular chromosome composed of double- 
stranded DNA, whereas eukaryotes have multiple, linear chromosomes 
composed of chromatin surrounded by a nuclear membrane. The 46 
chromosomes of human somatic cells are composed of 22 pairs of 
autosomes (matched pairs) and a pair of sex chromosomes, which may or 
may not be matched. This is the 2n or diploid state. Human gametes have 
23 chromosomes or one complete set of chromosomes; a set of 
chromosomes is complete with either one of the sex chromosomes. This is 
the n or haploid state. Genes are segments of DNA that code for a specific 
protein. An organism’s traits are determined by the genes inherited from 
each parent. Duplicated chromosomes are composed of two sister 
chromatids. Chromosomes are compacted using a variety of mechanisms 
during certain stages of the cell cycle. Several classes of protein are 
involved in the organization and packing of the chromosomal DNA into a 
highly condensed structure. The condensing complex compacts 
chromosomes, and the resulting condensed structure is necessary for 
chromosomal segregation during mitosis. 


Review Questions 


Exercise: 
Problem: 


A diploid cell has ________ the number of chromosomes as a haploid 
cell. 


a. one-fourth 
b. half 

c. twice 

d. four times 


Solution: 


C 
Exercise: 
Problem: 


An organism’s traits are determined by the specific combination of 
inherited ________ 


a. cells. 

b. genes. 

c. proteins. 

d. chromatids. 


Solution: 


B 
Exercise: 


Problem: 


The first level of DNA organization in a eukaryotic cell is maintained 
by which molecule? 


a. cohesin 
b. condensin 
c. chromatin 
d. histone 


Solution: 


D 
Exercise: 


Problem: 


Identical copies of chromatin held together by cohesin at the 
centromere are called ________ 


a. histones. 

b. nucleosomes. 

c. chromatin. 

d. sister chromatids. 


Solution: 


D 


Free Response 


Exercise: 


Problem: 
Compare and contrast a human somatic cell to a human gamete. 
Solution: 


Human somatic cells have 46 chromosomes: 22 pairs and 2 sex 
chromosomes that may or may not form a pair. This is the 2n or 
diploid condition. Human gametes have 23 chromosomes, one each of 
23 unique chromosomes, one of which is a sex chromosome. This is 
the n or haploid condition. 


Exercise: 


Problem: 
What is the relationship between a genome, chromosomes, and genes? 
Solution: 


The genome consists of the sum total of an organism’s chromosomes. 
Each chromosome contains hundreds and sometimes thousands of 
genes, segments of DNA that code for a polypeptide or RNA, and a 
large amount of DNA with no known function. 


Exercise: 
Problem: 


Eukaryotic chromosomes are thousands of times longer than a typical 
cell. Explain how chromosomes can fit inside a eukaryotic nucleus. 


Solution: 


The DNA double helix is wrapped around histone proteins to form 
structures called nucleosomes. Nucleosomes and the linker DNA in 
between them are coiled into a 30-nm fiber. During cell division, 
chromatin is further condensed by packing proteins. 


Glossary 


cell cycle 
ordered sequence of events that a cell passes through between one cell 
division and the next 


centromere 
region at which sister chromatids are bound together; a constricted 
area in condensed chromosomes 


chromatid 
single DNA molecule of two strands of duplicated DNA and 
associated proteins held together at the centromere 


diploid 
cell, nucleus, or organism containing two sets of chromosomes (2n) 


gamete 
haploid reproductive cell or sex cell (sperm, pollen grain, or egg) 


gene 
physical and functional unit of heredity, a sequence of DNA that codes 
for a protein. 


genome 
total genetic information of a cell or organism 


haploid 
cell, nucleus, or organism containing one set of chromosomes (n) 


histone 
one of several similar, highly conserved, low molecular weight, basic 
proteins found in the chromatin of all eukaryotic cells; associates with 
DNA to form nucleosomes 


homologous chromosomes 
chromosomes of the same morphology with genes in the same 
location; diploid organisms have pairs of homologous chromosomes 
(homologs), with each homolog derived from a different parent 


locus 
position of a gene on a chromosome 


nucleosome 
subunit of chromatin composed of a short length of DNA wrapped 
around a core of histone proteins 


The Cell Cycle 
By the end of this section, you will be able to: 


• Describe the three stages of interphase 
• Discuss the behavior of chromosomes during karyokinesis 
• Explain how the cytoplasmic content is divided during cytokinesis 
• Define the quiescent G₀ phase 


The cell cycle is an ordered series of events involving cell growth and cell 
division that produces two new daughter cells. Cells on the path to cell 
division proceed through a series of precisely timed and carefully regulated 
stages of growth, DNA replication, and division that produce two identical 
(clone) cells. The cell cycle has two major phases: interphase and the 
mitotic phase ([link]). During interphase, the cell grows and DNA is 
replicated. During the mitotic phase, the replicated DNA and cytoplasmic 
contents are separated, and the cell divides. 


[Figure: diagram of the cell cycle showing interphase and the mitotic phase (mitosis followed by the formation of two daughter cells)] 


The cell cycle consists of interphase and the mitotic phase. 
During interphase, the cell grows and the nuclear DNA is 
duplicated. Interphase is followed by the mitotic phase. During 
the mitotic phase, the duplicated chromosomes are segregated 
and distributed into daughter nuclei. The cytoplasm is usually 
divided as well, resulting in two daughter cells. 


Interphase 


During interphase, the cell undergoes normal growth processes while also 
preparing for cell division. In order for a cell to move from interphase into 
the mitotic phase, many internal and external conditions must be met. The 
three stages of interphase are called G₁, S, and G₂. 


G₁ Phase (First Gap) 


The first stage of interphase is called the G₁ phase (first gap) because, from 
a microscopic point of view, little change is visible. However, during the G₁ stage, 
the cell is quite active at the biochemical level. The cell is accumulating the 
building blocks of chromosomal DNA and the associated proteins as well as 
accumulating sufficient energy reserves to complete the task of replicating 
each chromosome in the nucleus. 


S Phase (Synthesis of DNA) 


Throughout interphase, nuclear DNA remains in a semi-condensed 
chromatin configuration. In the S phase, DNA replication can proceed 
through the mechanisms that result in the formation of identical pairs of 
DNA molecules (sister chromatids) that are firmly attached at the 
centromeric region. The centrosome is also duplicated during the S phase. The 
two centrosomes will give rise to the mitotic spindle, the apparatus that 
orchestrates the movement of chromosomes during mitosis. In animal cells, 
each centrosome is associated with a pair of rod-like objects, the 
centrioles, which are positioned at right angles to each other. Centrioles 
help organize cell division. Centrioles are not present in the centrosomes 
of other eukaryotic species, such as plants and most fungi. 


G₂ Phase (Second Gap) 


In the G₂ phase, the cell replenishes its energy stores and synthesizes 
proteins necessary for chromosome manipulation. Some cell organelles are 
duplicated, and the cytoskeleton is dismantled to provide resources for the 
mitotic phase. There may be additional cell growth during G₂. The final 
preparations for the mitotic phase must be completed before the cell is able 
to enter the first stage of mitosis. 


The Mitotic Phase 


The mitotic phase is a multistep process during which the duplicated 
chromosomes are aligned, separated, and moved into two new, identical 
daughter cells. The first portion of the mitotic phase is called karyokinesis, 
or nuclear division. The second portion of the mitotic phase, called 
cytokinesis, is the physical separation of the cytoplasmic components into 
the two daughter cells. 


Note: 
Link to Learning 
Revisit the stages of mitosis at this site. 


Karyokinesis (Mitosis) 


Karyokinesis, also known as mitosis, is divided into a series of phases 
(prophase, prometaphase, metaphase, anaphase, and telophase) that result in 
the division of the cell nucleus ([link]). 


Note: 
Art Connection 


[Figure: the stages of mitosis and cytokinesis, with the main events of each stage] 

Prophase 
• Chromosomes condense and become visible 
• Spindle fibers emerge from the centrosomes 
• Nuclear envelope breaks down 
• Nucleolus disappears 

Prometaphase 
• Chromosomes continue to condense 
• Kinetochores appear at the centromeres 
• Mitotic spindle microtubules attach to kinetochores 
• Centrosomes move toward opposite poles 

Metaphase 
• Mitotic spindle is fully developed, centrosomes are at opposite poles of the cell 
• Chromosomes are lined up at the metaphase plate 
• Each sister chromatid is attached to a spindle fiber originating from opposite poles 

Anaphase 
• Cohesin proteins binding the sister chromatids together break down 
• Sister chromatids (now called chromosomes) are pulled toward opposite poles 
• Non-kinetochore spindle fibers lengthen, elongating the cell 

Telophase 
• Chromosomes arrive at opposite poles and begin to decondense 
• Nuclear envelope material surrounds each set of chromosomes 
• The mitotic spindle breaks down 

Cytokinesis 
• Animal cells: a cleavage furrow separates the daughter cells 
• Plant cells: a cell plate separates the daughter cells 

Karyokinesis (or mitosis) is divided into five stages—prophase, 
prometaphase, metaphase, anaphase, and telophase. The 
pictures at the bottom were taken by fluorescence microscopy 
(hence, the black background) of cells artificially stained by 
fluorescent dyes: blue fluorescence indicates DNA 
(chromosomes) and green fluorescence indicates microtubules 
(spindle apparatus). (credit “mitosis drawings”: modification of 
work by Mariana Ruiz Villareal; credit “micrographs”: 
modification of work by Roy van Heesbeen; credit “cytokinesis 
micrograph”: Wadsworth Center/New York State Department 
of Health; scale-bar data from Matt Russell) 


Which of the following is the correct order of events in mitosis? 


a. Sister chromatids line up at the metaphase plate. The kinetochore 
becomes attached to the mitotic spindle. The nucleus reforms and the 
cell divides. Cohesin proteins break down and the sister chromatids 
separate. 

b. The kinetochore becomes attached to the mitotic spindle. Cohesin 
proteins break down and the sister chromatids separate. Sister 
chromatids line up at the metaphase plate. The nucleus reforms and 
the cell divides. 

c. The kinetochore becomes attached to the cohesin proteins. Sister 
chromatids line up at the metaphase plate. The kinetochore breaks 
down and the sister chromatids separate. The nucleus reforms and the 
cell divides. 

d. The kinetochore becomes attached to the mitotic spindle. Sister 
chromatids line up at the metaphase plate. Cohesin proteins break 
down and the sister chromatids separate. The nucleus reforms and the 
cell divides. 


During prophase, the “first phase,” the nuclear envelope starts to dissociate 
into small vesicles, and the membranous organelles (such as the Golgi 
apparatus and the endoplasmic reticulum) fragment and 
disperse toward the periphery of the cell. The nucleolus disappears 
(disperses). The centrosomes begin to move to opposite poles of the cell. 
Microtubules that will form the mitotic spindle extend between the 
centrosomes, pushing them farther apart as the microtubule fibers lengthen. 
The sister chromatids begin to coil more tightly with the aid of condensin 
proteins and become visible under a light microscope. 


During prometaphase, the “first change phase,” many processes that were 
begun in prophase continue to advance. The remnants of the nuclear 
envelope fragment. The mitotic spindle continues to develop as more 
microtubules assemble and stretch across the length of the former nuclear 
area. Chromosomes become more condensed and discrete. Each sister 
chromatid develops a protein structure called a kinetochore in the 
centromeric region ([link]). The proteins of the kinetochore attract and bind 
mitotic spindle microtubules. As the spindle microtubules extend from the 
centrosomes, some of these microtubules come into contact with and firmly 
bind to the kinetochores. Once a mitotic fiber attaches to a chromosome, the 
chromosome will be oriented until the kinetochores of sister chromatids 
face the opposite poles. Eventually, all the sister chromatids will be attached 
via their kinetochores to microtubules from opposing poles. Spindle 
microtubules that do not engage the chromosomes are called polar 
microtubules. These microtubules overlap each other midway between the 
two poles and contribute to cell elongation. Astral microtubules are located 
near the poles, aid in spindle orientation, and are required for the regulation 
of mitosis. 


[Figure: mitotic spindle] 


During prometaphase, mitotic 
spindle microtubules from 
opposite poles attach to each sister 
chromatid at the kinetochore. In 
anaphase, the connection between 
the sister chromatids breaks down, 
and the microtubules pull the 
chromosomes toward opposite 
poles. 


During metaphase, the “change phase,” all the chromosomes are aligned in 
a plane called the metaphase plate, or the equatorial plane, midway 
between the two poles of the cell. The sister chromatids are still tightly 
attached to each other by cohesin proteins. At this time, the chromosomes 
are maximally condensed. 


During anaphase, the “upward phase,” the cohesin proteins degrade, and 
the sister chromatids separate at the centromere. Each chromatid, now 
called a chromosome, is pulled rapidly toward the centrosome to which its 
microtubule is attached. The cell becomes visibly elongated (oval shaped) 
as the polar microtubules slide against each other at the metaphase plate 
where they overlap. 


During telophase, the “distance phase,” the chromosomes reach the 
opposite poles and begin to decondense (unravel), relaxing into a chromatin 
configuration. The mitotic spindles are depolymerized into tubulin 
monomers that will be used to assemble cytoskeletal components for each 
daughter cell. Nuclear envelopes form around the chromosomes, and 
nucleoli reappear within the nuclear area. 


Cytokinesis 


Cytokinesis, or “cell motion,” is the second main stage of the mitotic phase 
during which cell division is completed via the physical separation of the 
cytoplasmic components into two daughter cells. Division is not complete 
until the cell components have been apportioned and completely separated 
into the two daughter cells. Although the stages of mitosis are similar for 
most eukaryotes, the process of cytokinesis is quite different for eukaryotes 
that have cell walls, such as plant cells. 


In cells such as animal cells that lack cell walls, cytokinesis follows the 
onset of anaphase. A contractile ring composed of actin filaments forms just 
inside the plasma membrane at the former metaphase plate. The actin 
filaments pull the equator of the cell inward, forming a fissure. This fissure, 
or “crack,” is called the cleavage furrow. The furrow deepens as the actin 
ring contracts, and eventually the membrane is cleaved in two ([link]). 


In plant cells, a new cell wall must form between the daughter cells. During 
interphase, the Golgi apparatus accumulates enzymes, structural proteins, 
and glucose molecules prior to breaking into vesicles and dispersing 
throughout the dividing cell. During telophase, these Golgi vesicles are 
transported on microtubules to form a phragmoplast (a vesicular structure) 
at the metaphase plate. There, the vesicles fuse and coalesce from the center 
toward the cell walls; this structure is called a cell plate. As more vesicles 
fuse, the cell plate enlarges until it merges with the cell walls at the 
periphery of the cell. Enzymes use the glucose that has accumulated 
between the membrane layers to build a new cell wall. The Golgi 
membranes become parts of the plasma membrane on either side of the new 
cell wall ([link]). 


During cytokinesis in animal cells, 
a ring of actin filaments forms at 
the metaphase plate. The ring 
contracts, forming a cleavage 
furrow, which divides the cell in 
two. In plant cells, Golgi vesicles 
coalesce at the former metaphase 
plate, forming a phragmoplast. A 
cell plate formed by the fusion of 
the vesicles of the phragmoplast 
grows from the center toward the 
cell walls, and the membranes of 
the vesicles fuse to form a plasma 
membrane that divides the cell in 
two. 


G₀ Phase 


Not all cells adhere to the classic cell cycle pattern in which a newly formed 
daughter cell immediately enters the preparatory phases of interphase, 
closely followed by the mitotic phase. Cells in G₀ phase are not actively 
preparing to divide. The cell is in a quiescent (inactive) stage that occurs 
when cells exit the cell cycle. Some cells enter G₀ temporarily until an 
external signal triggers the onset of G₁. Other cells that never or rarely 
divide, such as mature cardiac muscle and nerve cells, remain in G₀ 
permanently. 


Note: 

Scientific Method Connection 

Determine the Time Spent in Cell Cycle Stages 

Problem: How long does a cell spend in interphase compared to each stage 
of mitosis? 

Background: A prepared microscope slide of blastula cross-sections will 
show cells arrested in various stages of the cell cycle. It is not visually 
possible to separate the stages of interphase from each other, but the 
mitotic stages are readily identifiable. If 100 cells are examined, the 
number of cells in each identifiable cell cycle stage will give an estimate of 
the time it takes for the cell to complete that stage. 

Problem Statement: Given the events included in all of interphase and 
those that take place in each stage of mitosis, estimate the length of each 
stage based on a 24-hour cell cycle. Before proceeding, state your 
hypothesis. 

Test your hypothesis: Test your hypothesis by doing the following: 


1. Place a fixed and stained microscope slide of whitefish blastula cross- 
sections under the scanning objective of a light microscope. 

2. Locate and focus on one of the sections using the scanning objective 
of your microscope. Notice that the section is a circle composed of 
dozens of closely packed individual cells. 

3. Switch to the low-power objective and refocus. With this objective, 
individual cells are visible. 


4. Switch to the high-power objective and slowly move the slide left to 
right, and up and down to view all the cells in the section ([link]). As 
you scan, you will notice that most of the cells are not undergoing 
mitosis but are in the interphase period of the cell cycle. 


Slowly scan whitefish blastula cells with the 
high-power objective as illustrated in image 
(a) to identify their mitotic stage. (b) A 
microscopic image of the scanned cells is 
shown. (credit “micrograph”: modification 
of work by Linda Flora; scale-bar data from 
Matt Russell) 


5. Practice identifying the various stages of the cell cycle, using the 
drawings of the stages as a guide ([link]). 

6. Once you are confident about your identification, begin to record the 
stage of each cell you encounter as you scan left to right, and top to 
bottom across the blastula section. 

7. Keep a tally of your observations and stop when you reach 100 cells 
identified. 

8. The larger the sample size (total number of cells counted), the more 
accurate the results. If possible, gather and record group data prior to 
calculating percentages and making estimates. 


Record your observations: Make a table similar to [link] in which you 
record your observations. 


Results of Cell Stage Identification 

Phase or Stage    Individual Totals    Group Totals    Percent 
Interphase 
Prophase 
Metaphase 
Anaphase 
Telophase 
Cytokinesis 
Totals            100                  100             100 percent 


Analyze your data/report your results: To find the length of time 
whitefish blastula cells spend in each stage, multiply the percent (recorded 
as a decimal) by 24 hours. Make a table similar to [link] to illustrate your 
data. 


Estimate of Cell Stage Length 

Phase or Stage    Percent (as Decimal)    Time in Hours 
Interphase 
Prophase 
Metaphase 
Anaphase 
Telophase 
Cytokinesis 
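
The calculation in the analysis step can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_stage_hours` is hypothetical, and the tallies are made-up placeholder values, not real observations. The one rule taken from the text is that the fraction of cells seen in a stage is multiplied by the 24-hour cycle length.

```python
def estimate_stage_hours(tallies, cycle_hours=24.0):
    """Estimate hours per stage as (stage count / total count) * cycle length."""
    total = sum(tallies.values())
    if total == 0:
        raise ValueError("no cells were counted")
    return {stage: count / total * cycle_hours for stage, count in tallies.items()}


# Hypothetical tallies for 100 cells; replace these with your own counts.
example_tallies = {
    "Interphase": 80,
    "Prophase": 8,
    "Metaphase": 4,
    "Anaphase": 3,
    "Telophase": 3,
    "Cytokinesis": 2,
}

for stage, hours in estimate_stage_hours(example_tallies).items():
    print(f"{stage}: {hours:.2f} hours")
```

Because each estimate is a share of the same total, the estimated times always add up to the full cycle length, which is a quick way to check a hand-built table.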


Draw a conclusion: Did your results support your estimated times? Were 
any of the outcomes unexpected? If so, discuss which events in that stage 
might contribute to the calculated time. 


Section Summary 


The cell cycle is an orderly sequence of events. Cells on the path to cell 
division proceed through a series of precisely timed and carefully regulated 
stages. In eukaryotes, the cell cycle consists of a long preparatory period, 
called interphase, followed by the mitotic phase. Interphase is divided 
into G₁, S, and G₂ phases. The 
mitotic phase begins with karyokinesis (mitosis), which consists of five 
stages: prophase, prometaphase, metaphase, anaphase, and telophase. The 
final stage of the mitotic phase is cytokinesis, during which the cytoplasmic 
components of the daughter cells are separated either by an actin ring 
(animal cells) or by cell plate formation (plant cells). 


Art Connections 


Exercise: 


Problem: 
[link] Which of the following is the correct order of events in mitosis? 


a. Sister chromatids line up at the metaphase plate. The kinetochore 
becomes attached to the mitotic spindle. The nucleus reforms and 
the cell divides. Cohesin proteins break down and the sister 
chromatids separate. 


b. The kinetochore becomes attached to the mitotic spindle. Cohesin 
proteins break down and the sister chromatids separate. Sister 
chromatids line up at the metaphase plate. The nucleus reforms 
and the cell divides. 

c. The kinetochore becomes attached to the cohesin proteins. Sister 
chromatids line up at the metaphase plate. The kinetochore breaks 
down and the sister chromatids separate. The nucleus reforms and 
the cell divides. 

d. The kinetochore becomes attached to the mitotic spindle. Sister 
chromatids line up at the metaphase plate. Cohesin proteins break 
down and the sister chromatids separate. The nucleus reforms and 
the cell divides. 


Solution: 


[link] D. The kinetochore becomes attached to the mitotic spindle. 
Sister chromatids line up at the metaphase plate. Cohesin proteins 
break down and the sister chromatids separate. The nucleus reforms 
and the cell divides. 


Review Questions 


Exercise: 


Problem: 
Chromosomes are duplicated during what stage of the cell cycle? 


a. G₁ phase 

b. S phase 

c. prophase 

d. prometaphase 


Solution: 


B 


Exercise: 


Problem: 


Which of the following events does not occur during any stage of 
interphase? 


a. DNA duplication 

b. organelle duplication 

c. increase in cell size 

d. separation of sister chromatids 


Solution: 


D 


Exercise: 


Problem: 
The mitotic spindles arise from which cell structure? 


a. centromere 
b. centrosome 
c. kinetochore 
d. cleavage furrow 


Solution: 


B 
Exercise: 


Problem: 


Attachment of the mitotic spindle fibers to the kinetochores is a 
characteristic of which stage of mitosis? 


a. prophase 
b. prometaphase 


c. metaphase 
d. anaphase 


Solution: 


B 
Exercise: 
Problem: 


Unpacking of chromosomes and the formation of a new nuclear 
envelope is a characteristic of which stage of mitosis? 


a. prometaphase 
b. metaphase 

c. anaphase 

d. telophase 


Solution: 


D 
Exercise: 


Problem: 


Separation of the sister chromatids is a characteristic of which stage of 
mitosis? 


a. prometaphase 
b. metaphase 

c. anaphase 

d. telophase 


Solution: 


C 


Exercise: 


Problem: 


The chromosomes become visible under a light microscope during 
which stage of mitosis? 


a. prophase 

b. prometaphase 
c. metaphase 

d. anaphase 


Solution: 


A 
Exercise: 


Problem: 


The fusing of Golgi vesicles at the metaphase plate of dividing plant 
cells forms what structure? 


a. cell plate 

b. actin ring 

c. cleavage furrow 
d. mitotic spindle 


Solution: 


A 


Free Response 


Exercise: 


Problem: 
Briefly describe the events that occur in each phase of interphase. 
Solution: 


During G₁, the cell increases in size, the genomic DNA is assessed for 
damage, and the cell stockpiles energy reserves and the components to 
synthesize DNA. During the S phase, the chromosomes, the 
centrosomes, and the centrioles (animal cells) duplicate. During the G₂ 
phase, the cell recovers from the S phase, continues to grow, duplicates 
some organelles, and dismantles other organelles. 


Exercise: 
Problem: 
Chemotherapy drugs such as vincristine and colchicine disrupt mitosis 
by binding to tubulin (the subunit of microtubules) and interfering with 
microtubule assembly and disassembly. Exactly what mitotic structure 
is targeted by these drugs and what effect would that have on cell 
division? 


Solution: 


The mitotic spindle is formed of microtubules. Microtubules are 
polymers of the protein tubulin; therefore, it is the mitotic spindle that 
is disrupted by these drugs. Without a functional mitotic spindle, the 
chromosomes will not be sorted or separated during mitosis. The cell 
will arrest in mitosis and die. 


Exercise: 


Problem: 


Describe the similarities and differences between the cytokinesis 
mechanisms found in animal cells versus those in plant cells. 


Solution: 


There are very few similarities between animal cell and plant cell 
cytokinesis. In animal cells, a ring of actin fibers is formed around the 
periphery of the cell at the former metaphase plate (cleavage furrow). 
The actin ring contracts inward, pulling the plasma membrane toward 
the center of the cell until the cell is pinched in two. In plant cells, a 
new cell wall must be formed between the daughter cells. Due to the 
rigid cell walls of the parent cell, contraction of the middle of the cell 
is not possible. Instead, a phragmoplast first forms. Subsequently, a 
cell plate is formed in the center of the cell at the former metaphase 
plate. The cell plate is formed from Golgi vesicles that contain 
enzymes, proteins, and glucose. The vesicles fuse and the enzymes 
build a new cell wall from the proteins and glucose. The cell plate 
grows toward and eventually fuses with the cell wall of the parent cell. 


Exercise: 
Problem: 


List some reasons why a cell that has just completed cytokinesis might 
enter the G₀ phase instead of the G₁ phase. 


Solution: 


Many cells temporarily enter G₀ until they reach maturity. Some cells 
are only triggered to enter G₁ when the organism needs to increase that 
particular cell type. Some cells only reproduce following an injury to 
the tissue. Some cells never divide once they reach maturity. 


Exercise: 
Problem: 


What cell cycle events will be affected in a cell that produces mutated 
(non-functional) cohesin protein? 


Solution: 
If cohesin is not functional, sister chromatids are not held together after 
DNA replication in the S phase of interphase. Without that connection, the 
kinetochores of sister chromatids could not be reliably oriented toward 
opposite poles. Even if the mitotic spindle fibers could attach to the 
individual chromatids, the chromosomes would not be sorted or separated 
correctly during mitosis. 


Glossary 


anaphase 
stage of mitosis during which sister chromatids are separated from 
each other 


cell cycle 
ordered series of events involving cell growth and cell division that 
produces two new daughter cells 


cell plate 
structure formed during plant cell cytokinesis by Golgi vesicles, 
forming a temporary structure (phragmoplast) and fusing at the 
metaphase plate; ultimately leads to the formation of cell walls that 
separate the two daughter cells 


centriole 
rod-like structure constructed of microtubules at the center of each 
animal cell centrosome 


cleavage furrow 
constriction formed by an actin ring during cytokinesis in animal cells 
that leads to cytoplasmic division 


condensin 
proteins that help sister chromatids coil during prophase 


cytokinesis 
division of the cytoplasm following mitosis that forms two daughter 
cells 


G₀ phase 
distinct from the G₁ phase of interphase; a cell in G₀ is not preparing to 
divide 


G₁ phase 
(also, first gap) first phase of interphase, centered on cell growth 


G₂ phase 
(also, second gap) third phase of interphase during which the cell 
undergoes final preparations for mitosis 


interphase 
period of the cell cycle leading up to mitosis; includes G₁, S, and G₂ 
phases (the interim period between two consecutive cell divisions) 


karyokinesis 
mitotic nuclear division 


kinetochore 
protein structure associated with the centromere of each sister 
chromatid that attracts and binds spindle microtubules during 
prometaphase 


metaphase plate 
equatorial plane midway between the two poles of a cell where the 
chromosomes align during metaphase 


metaphase 
stage of mitosis during which chromosomes are aligned at the 
metaphase plate 


mitosis 
(also, karyokinesis) period of the cell cycle during which the 
duplicated chromosomes are separated into identical nuclei; includes 
prophase, prometaphase, metaphase, anaphase, and telophase 


mitotic phase 
period of the cell cycle during which duplicated chromosomes are 
distributed into two nuclei and cytoplasmic contents are divided; 
includes karyokinesis (mitosis) and cytokinesis 


mitotic spindle 
apparatus composed of microtubules that orchestrates the movement of 
chromosomes during mitosis 


prometaphase 
stage of mitosis during which the nuclear membrane breaks down and 
mitotic spindle fibers attach to kinetochores 


prophase 
stage of mitosis during which chromosomes condense and the mitotic 
spindle begins to form 


quiescent 
refers to a cell that is performing normal cell functions and has not 
initiated preparations for cell division 


S phase 
second, or synthesis, stage of interphase during which DNA replication 
occurs 


telophase 
stage of mitosis during which chromosomes arrive at opposite poles, 
decondense, and are surrounded by a new nuclear envelope 


Control of the Cell Cycle 
By the end of this section, you will be able to: 


• Understand how the cell cycle is controlled by mechanisms both 
internal and external to the cell 
• Explain how the three internal control checkpoints occur at the end of 
G₁, at the G₂/M transition, and during metaphase 
• Describe the molecules that control the cell cycle through positive and 
negative regulation 


The length of the cell cycle is highly variable, even within the cells of a 
single organism. In humans, the frequency of cell turnover ranges from a 
few hours in early embryonic development, to an average of two to five 
days for epithelial cells, and to an entire human lifetime spent in G₀ by 
specialized cells, such as cortical neurons or cardiac muscle cells. There is 
also variation in the time that a cell spends in each phase of the cell cycle. 
When fast-dividing mammalian cells are grown in culture (outside the body 
under optimal growing conditions), the length of the cycle is about 24 
hours. In rapidly dividing human cells with a 24-hour cell cycle, the G₁ 
phase lasts approximately nine hours, the S phase lasts 10 hours, the G₂ 
phase lasts about four and one-half hours, and the M phase lasts 
approximately one-half hour. In early embryos of fruit flies, the cell cycle is 
completed in about eight minutes. The timing of events in the cell cycle is 
controlled by mechanisms that are both internal and external to the cell. 
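
The phase durations quoted for a rapidly dividing human cell add up to the full 24-hour cycle, which a short Python sketch can confirm. The variable and function names here are illustrative only; the hour values are the approximate ones given in the text.

```python
# Approximate phase durations (hours) quoted for a rapidly dividing human cell.
phase_hours = {"G1": 9.0, "S": 10.0, "G2": 4.5, "M": 0.5}

total_hours = sum(phase_hours.values())
interphase_hours = total_hours - phase_hours["M"]  # G1 + S + G2


def phase_fractions(durations):
    """Return each phase's share of the whole cycle as a fraction of 1."""
    total = sum(durations.values())
    return {phase: hours / total for phase, hours in durations.items()}


fractions = phase_fractions(phase_hours)
for phase, fraction in fractions.items():
    print(f"{phase}: {phase_hours[phase]} h ({fraction:.1%} of the cycle)")
print(f"Interphase total: {interphase_hours} h of {total_hours} h")
```

On these figures, interphase accounts for all but about half an hour of the cycle, which matches the observation that most cells on a slide are seen in interphase.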


Regulation of the Cell Cycle by External Events 


Both the initiation and inhibition of cell division are triggered by events 
external to the cell when it is about to begin the replication process. An 
event may be as simple as the death of a nearby cell or as sweeping as the 
release of growth-promoting hormones, such as human growth hormone 
(HGH). A lack of HGH can inhibit cell division, resulting in dwarfism, 
whereas too much HGH can result in gigantism. Crowding of cells can also 
inhibit cell division. Another factor that can initiate cell division is the size 
of the cell; as a cell grows, it becomes inefficient due to its decreasing 
surface-to-volume ratio. The solution to this problem is to divide. 
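
The falling surface-to-volume ratio can be illustrated with a short Python sketch. It assumes, purely for simplicity, that the cell is a perfect sphere, which real cells are not; the radii are hypothetical values chosen only to show the trend.

```python
import math


def surface_to_volume_ratio(radius):
    """Surface area divided by volume for a sphere of the given radius."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    surface_area = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return surface_area / volume  # algebraically equal to 3 / radius


# Hypothetical radii in micrometers; each doubling of the radius halves the ratio.
for radius_um in (5.0, 10.0, 20.0):
    ratio = surface_to_volume_ratio(radius_um)
    print(f"radius {radius_um} um: surface/volume = {ratio:.2f} per um")
```

Under this spherical assumption the ratio is 3 divided by the radius, so a cell that doubles its radius has only half as much membrane surface per unit of volume to supply.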


Whatever the source of the message, the cell receives the signal, and a 
series of events within the cell allows it to proceed into interphase. Moving 
forward from this initiation point, every parameter required during each cell 
cycle phase must be met or the cycle cannot progress. 


Regulation at Internal Checkpoints 


It is essential that the daughter cells produced be exact duplicates of the 
parent cell. Mistakes in the duplication or distribution of the chromosomes 
lead to mutations that may be passed forward to every new cell produced 
from an abnormal cell. To prevent a compromised cell from continuing to 
divide, there are internal control mechanisms that operate at three main cell 
cycle checkpoints. A checkpoint is one of several points in the eukaryotic 
cell cycle at which the progression of a cell to the next stage in the cycle 
can be halted until conditions are favorable. These checkpoints occur near 
the end of G₁, at the G₂/M transition, and during metaphase ([link]). 


The cell cycle is controlled at three checkpoints. The integrity 
of the DNA is assessed at the G₁ checkpoint. Proper 
chromosome duplication is assessed at the G₂ checkpoint. 
Attachment of each kinetochore to a spindle fiber is assessed at 
the M checkpoint. 


The G₁ Checkpoint 


The G₁ checkpoint determines whether all conditions are favorable for cell 
division to proceed. The G₁ checkpoint, also called the restriction point (in 
animal cells) or Start (in yeast), is a point at which the cell irreversibly 
commits to the cell division process. External influences, such as growth 
factors, play a large role in carrying the cell past the G₁ checkpoint. In 
addition to adequate reserves and cell size, there is a check for genomic 
DNA damage at the G₁ checkpoint. A cell that does not meet all the 
requirements will not be allowed to progress into the S phase. The cell can 
halt the cycle and attempt to remedy the problematic condition, or the cell 
can advance into G₀ and 
await further signals when conditions improve. 


The G₂ Checkpoint 


The G₂ checkpoint bars entry into the mitotic phase if certain conditions are 
not met. As at the G₁ checkpoint, cell size and protein reserves are assessed. 
However, the most important role of the G₂ checkpoint is to ensure that all 
of the chromosomes have been replicated and that the replicated DNA is 
not damaged. If the checkpoint mechanisms detect problems with the DNA, 
the cell cycle is halted, and the cell attempts to either complete DNA 
replication or repair the damaged DNA. 


The M Checkpoint 


The M checkpoint occurs near the end of the metaphase stage of 
karyokinesis. The M checkpoint is also known as the spindle checkpoint, 
because it determines whether all the sister chromatids are correctly 
attached to the spindle microtubules. Because the separation of the sister 
chromatids during anaphase is an irreversible step, the cycle will not 
proceed until the kinetochores of each pair of sister chromatids are firmly 
anchored to at least two spindle fibers arising from opposite poles of the 
cell. 


Note: 
Link to Learning 
Watch what occurs at the G₁, G₂, and M checkpoints by visiting this 
website to see an animation of the cell cycle. 


Regulator Molecules of the Cell Cycle 


In addition to the internally controlled checkpoints, there are two groups of 
intracellular molecules that regulate the cell cycle. These regulatory 
molecules either promote progress of the cell to the next phase (positive 
regulation) or halt the cycle (negative regulation). Regulator molecules may 
act individually, or they can influence the activity or production of other 
regulatory proteins. Therefore, the failure of a single regulator may have 
almost no effect on the cell cycle, especially if more than one mechanism 
controls the same event. Conversely, the effect of a deficient or non- 
functioning regulator can be wide-ranging and possibly fatal to the cell if 
multiple processes are affected. 


Positive Regulation of the Cell Cycle 


Two groups of proteins, called cyclins and cyclin-dependent kinases 
(Cdks), are responsible for the progress of the cell through the various 
checkpoints. The levels of the four cyclin proteins fluctuate throughout the 
cell cycle in a predictable pattern ([link]). Increases in the concentration of 
cyclin proteins are triggered by both external and internal signals. After the 
cell moves to the next stage of the cell cycle, the cyclins that were active in 
the previous stage are degraded. 

The concentrations of cyclin proteins change 
throughout the cell cycle. There is a direct 
correlation between cyclin accumulation and 
the three major cell cycle checkpoints. Also 
note the sharp decline of cyclin levels 
following each checkpoint (the transition 
between phases of the cell cycle), as cyclin 
is degraded by cytoplasmic enzymes. 
(credit: modification of work by 
"WikiMiMa"/Wikimedia Commons) 


Cyclins regulate the cell cycle only when they are tightly bound to Cdks. To 
be fully active, the Cdk/cyclin complex must also be phosphorylated in 
specific locations. Like all kinases, Cdks are enzymes that 
phosphorylate other proteins. Phosphorylation activates the protein by 
changing its shape. The proteins phosphorylated by Cdks are involved in 
advancing the cell to the next phase ([link]). The levels of Cdk proteins are 
relatively stable throughout the cell cycle; however, the concentrations of 
cyclin fluctuate and determine when Cdk/cyclin complexes form. The 
different cyclins and Cdks bind at specific points in the cell cycle and thus 
regulate different checkpoints. 


Cyclin-dependent Kinases 
Cyclin-dependent kinases (Cdks) are 
protein kinases that, when fully 
activated, can phosphorylate and thus 
activate other proteins that advance 
the cell cycle past a checkpoint. To 
become fully activated, a Cdk must 
bind to a cyclin protein and then be 
phosphorylated by another kinase. 


Since the cyclic fluctuations of cyclin levels are based on the timing of the 
cell cycle and not on specific events, regulation of the cell cycle usually 
occurs by either the Cdk molecules alone or the Cdk/cyclin complexes. 
Without a specific concentration of fully activated cyclin/Cdk complexes, 
the cell cycle cannot proceed through the checkpoints. 


Although the cyclins are the main regulatory molecules that determine the 
forward momentum of the cell cycle, there are several other mechanisms 
that fine-tune the progress of the cycle with negative, rather than positive, 
effects. These mechanisms essentially block the progression of the cell 
cycle until problematic conditions are resolved. Molecules that prevent the 
full activation of Cdks are called Cdk inhibitors. Many of these inhibitor 
molecules directly or indirectly monitor a particular cell cycle event. The 
block placed on Cdks by inhibitor molecules will not be removed until the 
specific event that the inhibitor monitors is completed. 
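

The activation requirements described above can be summarized as a small 
logical model. The following Python sketch is only an illustration, not a 
biochemical simulation: the class `Cdk`, the function `can_pass_checkpoint`, 
and the `required_active` threshold are hypothetical names and values chosen 
for this example. It assumes that a Cdk counts as fully active only when it 
is bound to a cyclin, carries its activating phosphate, and is free of any 
Cdk inhibitor. 

```python
from dataclasses import dataclass


@dataclass
class Cdk:
    """Toy model of one cyclin-dependent kinase (illustrative only)."""
    cyclin_bound: bool = False     # a cyclin partner is attached
    phosphorylated: bool = False   # activating phosphate added by another kinase
    inhibitor_bound: bool = False  # a Cdk inhibitor (such as p21) is attached

    def is_fully_active(self) -> bool:
        # All three conditions must hold at the same time.
        return (self.cyclin_bound
                and self.phosphorylated
                and not self.inhibitor_bound)


def can_pass_checkpoint(cdks: list[Cdk], required_active: int) -> bool:
    """Return True when enough fully active Cdk/cyclin complexes exist.

    `required_active` is a hypothetical threshold standing in for the
    "specific concentration" of active complexes mentioned in the text.
    """
    active = sum(1 for cdk in cdks if cdk.is_fully_active())
    return active >= required_active
```

In this sketch, a Cdk that is bound to cyclin and phosphorylated but also 
bound by an inhibitor does not count toward the threshold, which mirrors how 
a Cdk inhibitor blocks progress until the monitored event is completed. 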


Negative Regulation of the Cell Cycle 


The second group of cell cycle regulatory molecules are negative 
regulators. Negative regulators halt the cell cycle. Remember that in 
positive regulation, active molecules cause the cycle to progress. 


The best understood negative regulatory molecules are retinoblastoma 
protein (Rb), p53, and p21. Retinoblastoma proteins are a group of tumor- 
suppressor proteins common in many cells. The 53 and 21 designations 
refer to the functional molecular masses of the proteins (p) in kilodaltons. 
Much of what is known about cell cycle regulation comes from research 
conducted with cells that have lost regulatory control. All three of these 
regulatory proteins were discovered to be damaged or non-functional in 
cells that had begun to replicate uncontrollably (became cancerous). In each 
case, the main cause of the unchecked progress through the cell cycle was a 
faulty copy of the regulatory protein. 


Rb, p53, and p21 act primarily at the G₁ checkpoint. p53 is a 
multifunctional protein that has a major impact on the commitment of a cell to 
division because it acts when there is damaged DNA in cells that are 
undergoing the preparatory processes during G₁. If damaged DNA is 
detected, p53 halts the cell cycle and recruits enzymes to repair the DNA. If 
the DNA cannot be repaired, p53 can trigger apoptosis, or cell suicide, to 
prevent the duplication of damaged chromosomes. As p53 levels rise, the 
production of p21 is triggered. p21 enforces the halt in the cycle dictated by 
p53 by binding to and inhibiting the activity of the Cdk/cyclin complexes. 
As a cell is exposed to more stress, higher levels of p53 and p21 
accumulate, making it less likely that the cell will move into the S phase. 


Rb exerts its regulatory influence on other positive regulator proteins. 
Chiefly, Rb monitors cell size. In the active, dephosphorylated state, Rb 
binds to proteins called transcription factors, most commonly E2F ([link]). 
Transcription factors “turn on” specific genes, allowing the production of 
the proteins encoded by those genes. When Rb is bound to E2F, production of 
proteins necessary for the G₁/S transition is blocked. As the cell increases in 
size, Rb is slowly phosphorylated until it becomes inactivated. Rb releases 
E2F, which can now turn on the gene that produces the transition protein, 
and this particular block is removed. For the cell to move past each of the 
checkpoints, all positive regulators must be “turned on,” and all negative 
regulators must be “turned off.” 
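

The Rb switch described above can be expressed as a short chain of 
conditions. The Python sketch below is a simplified illustration under 
stated assumptions: the function name `g1_s_genes_on`, the `size_threshold` 
parameter, and the linear link between cell size and Rb phosphorylation are 
all hypothetical and are not taken from the text. 

```python
def g1_s_genes_on(cell_size: float, size_threshold: float) -> bool:
    """Toy model: are E2F-controlled G1/S genes switched on?

    Assumes Rb phosphorylation rises in proportion to cell size and is
    complete once the cell reaches `size_threshold` (both hypothetical).
    """
    # Fraction of Rb phosphorylation, capped at 1.0 (fully phosphorylated).
    rb_phosphorylation = min(cell_size / size_threshold, 1.0)

    # Dephosphorylated (or partly phosphorylated) Rb is the active form.
    rb_active = rb_phosphorylation < 1.0

    # Active Rb holds E2F; inactive Rb releases it.
    e2f_free = not rb_active

    # Free E2F turns on the genes needed for the transition.
    return e2f_free
```

A small cell keeps Rb active and E2F bound, so the function returns False. 
Once the cell reaches the assumed threshold, Rb is inactivated and the 
function returns True. 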


Note: 
Art Connection 


[Figure: Rb Regulation of the Cell. Unphosphorylated Rb binds transcription 
factor E2F. E2F cannot bind the DNA, and transcription is blocked. Cell growth 
triggers the phosphorylation of Rb. Phosphorylated Rb releases E2F, which 
binds the DNA and turns on gene expression, thus advancing the cell cycle.] 


Rb halts the cell cycle and releases its hold in response to 
cell growth. 


Rb and other proteins that negatively regulate the cell cycle are sometimes 
called tumor suppressors. Why do you think the name tumor suppressor 
might be appropriate for these proteins? 


Section Summary 


Each step of the cell cycle is monitored by internal controls called 
checkpoints. There are three major checkpoints in the cell cycle: one near 
the end of G₁, a second at the G₂/M transition, and the third during 
metaphase. Positive regulator molecules allow the cell cycle to advance to 
the next stage. Negative regulator molecules monitor cellular conditions 
and can halt the cycle until specific requirements are met. 


Art Connections 


Exercise: 


Problem: 


[link] Rb and other proteins that negatively regulate the cell cycle are 
sometimes called tumor suppressors. Why do you think the name 
tumor suppressor might be appropriate for these proteins? 


Solution: 


[link] Rb and other negative regulatory proteins control cell division 
and therefore prevent the formation of tumors. Mutations that prevent 
these proteins from carrying out their function can result in cancer. 


Review Questions 


Exercise: 


Problem: 


At which of the cell cycle checkpoints do external forces have the 
greatest influence? 


a. G₁ checkpoint 
b. G₂ checkpoint 
c. M checkpoint 
d. G₀ checkpoint 


Solution: 


A 
Exercise: 


Problem: 
What is the main prerequisite for clearance at the G₂ checkpoint? 


a. cell has reached a sufficient size 


b. an adequate stockpile of nucleotides 
c. accurate and complete DNA replication 
d. proper attachment of mitotic spindle fibers to kinetochores 


Solution: 


C 
Exercise: 
Problem: 


If the M checkpoint is not cleared, what stage of mitosis will be 
blocked? 


a. prophase 

b. prometaphase 
c. metaphase 

d. anaphase 


Solution: 


D 
Exercise: 
Problem: 


Which protein is a positive regulator that phosphorylates other proteins 
when activated? 


a. p53 

b. retinoblastoma protein (Rb) 

c. cyclin 

d. cyclin-dependent kinase (Cdk) 


Solution: 


D 
Exercise: 


Problem: 


Many of the negative regulator proteins of the cell cycle were 
discovered in what type of cells? 


a. gametes 

b. cells in G₀ 
c. cancer cells 
d. stem cells 


Solution: 


C 
Exercise: 


Problem: 


Which negative regulatory molecule can trigger cell suicide 
(apoptosis) if vital cell cycle events do not occur? 


a. p53 

b. p21 

c. retinoblastoma protein (Rb) 

d. cyclin-dependent kinase (Cdk) 


Solution: 


A 


Free Response 


Exercise: 


Problem: 


Describe the general conditions that must be met at each of the three 
main cell cycle checkpoints. 


Solution: 


The G₁ checkpoint monitors adequate cell growth, the state of the 
genomic DNA, adequate stores of energy, and materials for S phase. 
At the G₂ checkpoint, DNA is checked to ensure that all chromosomes 
were duplicated and that there are no mistakes in newly synthesized 
DNA. Additionally, cell size and energy reserves are evaluated. The M 
checkpoint confirms the correct attachment of the mitotic spindle 
fibers to the kinetochores. 


Exercise: 
Problem: 


Explain the roles of the positive cell cycle regulators compared to the 
negative regulators. 


Solution: 


Positive cell regulators such as cyclin and Cdk perform tasks that 
advance the cell cycle to the next stage. Negative regulators such as 
Rb, p53, and p21 block the progression of the cell cycle until certain 
events have occurred. 


Exercise: 
Problem: What steps are necessary for Cdk to become fully active? 
Solution: 


Cdk must bind to a cyclin, and it must be phosphorylated in the correct 
position to become fully active. 


Exercise: 


Problem: 


Rb is a negative regulator that blocks the cell cycle at the G₁ 
checkpoint until the cell achieves a requisite size. What molecular 
mechanism does Rb employ to halt the cell cycle? 


Solution: 


Rb is active when it is dephosphorylated. In this state, Rb binds to E2F, 
which is a transcription factor required for the transcription and 
eventual translation of molecules required for the G₁/S transition. E2F 
cannot transcribe certain genes when it is bound to Rb. As the cell 
increases in size, Rb becomes phosphorylated, inactivated, and 
releases E2F. E2F can then promote the transcription of the genes it 
controls, and the transition proteins will be produced. 


Glossary 


cell cycle checkpoint 
mechanism that monitors the preparedness of a eukaryotic cell to 
advance through the various cell cycle stages 


cyclin 
one of a group of proteins that act in conjunction with cyclin- 
dependent kinases to help regulate the cell cycle by phosphorylating 
key proteins; the concentrations of cyclins fluctuate throughout the cell 
cycle 


cyclin-dependent kinase 
one of a group of protein kinases that helps to regulate the cell cycle 
when bound to cyclin; it functions to phosphorylate other proteins that 
are either activated or inactivated by phosphorylation 


p21 
cell cycle regulatory protein that inhibits the cell cycle; its levels are 
controlled by p53 


p53 
cell cycle regulatory protein that regulates cell growth and monitors 
DNA damage; it halts the progression of the cell cycle in cases of 
DNA damage and may induce apoptosis 


retinoblastoma protein (Rb) 
regulatory molecule that exhibits negative effects on the cell cycle by 
interacting with a transcription factor (E2F) 


Cancer and the Cell Cycle 
By the end of this section, you will be able to: 


• Describe how cancer is caused by uncontrolled cell growth 
• Understand how proto-oncogenes are normal cell genes that, when 
  mutated, become oncogenes 
• Describe how tumor suppressors function 
• Explain how mutant tumor suppressors cause cancer 


Cancer comprises many different diseases caused by a common 
mechanism: uncontrolled cell growth. Despite the redundancy and 
overlapping levels of cell cycle control, errors do occur. One of the critical 
processes monitored by the cell cycle checkpoint surveillance mechanism is 
the proper replication of DNA during the S phase. Even when all of the cell 
cycle controls are fully functional, a small percentage of replication errors 
(mutations) will be passed on to the daughter cells. If changes to the DNA 
nucleotide sequence occur within a coding portion of a gene and are not 
corrected, a gene mutation results. All cancers start when a gene mutation 
gives rise to a faulty protein that plays a key role in cell reproduction. The 
change in the cell that results from the malformed protein may be minor: 
perhaps a slight delay in the binding of Cdk to cyclin or an Rb protein that 
releases its target, E2F, while still dephosphorylated. Even minor 
mistakes, however, may allow subsequent mistakes to occur more readily. 
Over and over, small uncorrected errors are passed from the parent cell to 
the daughter cells and amplified as each generation produces more non- 
functional proteins from uncorrected DNA damage. Eventually, the pace of 
the cell cycle speeds up as the effectiveness of the control and repair 
mechanisms decreases. Uncontrolled growth of the mutated cells outpaces 
the growth of normal cells in the area, and a tumor (“-oma”) can result. 


Proto-oncogenes 


The genes that code for the positive cell cycle regulators are called proto- 
oncogenes. Proto-oncogenes are normal genes that, when mutated in certain 
ways, become oncogenes, genes that cause a cell to become cancerous. 
Consider what might happen to the cell cycle in a cell with a recently 
acquired oncogene. In most instances, the alteration of the DNA sequence 
will result in a less functional (or non-functional) protein. The result is 
detrimental to the cell and will likely prevent the cell from completing the 
cell cycle; however, the organism is not harmed because the mutation will 
not be carried forward. If a cell cannot reproduce, the mutation is not 
propagated and the damage is minimal. Occasionally, however, a gene 
mutation causes a change that increases the activity of a positive regulator. 
For example, a mutation that allows Cdk to be activated without being 
partnered with cyclin could push the cell cycle past a checkpoint before all 
of the required conditions are met. If the resulting daughter cells are too 
damaged to undergo further cell divisions, the mutation would not be 
propagated and no harm would come to the organism. However, if the 
atypical daughter cells are able to undergo further cell divisions, subsequent 
generations of cells will probably accumulate even more mutations, some 
possibly in additional genes that regulate the cell cycle. 


The Cdk gene in the above example is only one of many genes that are 
considered proto-oncogenes. In addition to the cell cycle regulatory 
proteins, any protein that influences the cycle can be altered in such a way 
as to override cell cycle checkpoints. An oncogene is any gene that, when 
altered, leads to an increase in the rate of cell cycle progression. 


Tumor Suppressor Genes 


Like proto-oncogenes, many of the negative cell cycle regulatory proteins 
were discovered in cells that had become cancerous. Tumor suppressor 
genes are segments of DNA that code for negative regulator proteins, the 
type of regulators that, when activated, can prevent the cell from 
undergoing uncontrolled division. The collective function of the best- 
understood tumor suppressor gene proteins, Rb, p53, and p21, is to put up a 
roadblock to cell cycle progression until certain events are completed. A 
cell that carries a mutated form of a negative regulator might not be able to 
halt the cell cycle if there is a problem. Tumor suppressors are similar to 
brakes in a vehicle: Malfunctioning brakes can contribute to a car crash. 


Mutated p53 genes have been identified in more than one-half of all human 
tumor cells. This discovery is not surprising in light of the multiple roles 
that the p53 protein plays at the G₁ checkpoint. A cell with a faulty p53 may 
fail to detect errors present in the genomic DNA ([link]). Even if a partially 
functional p53 does identify the mutations, it may no longer be able to 
signal the necessary DNA repair enzymes. Either way, damaged DNA will 
remain uncorrected. At this point, a functional p53 will deem the cell 
unsalvageable and trigger programmed cell death (apoptosis). The damaged 
version of p53 found in cancer cells, however, cannot trigger apoptosis. 


Note: 
Art Connection 


[Figure: Cells can become cancerous. When cellular damage occurs, p53 arrests 
the cell cycle until the damage is repaired. If damage cannot be repaired, 
apoptosis occurs. Mutated p53 does not arrest the cell cycle. The damaged cell 
continues to divide, which may result in cancer.] 


The role of normal p53 is to monitor DNA and the supply of oxygen (hypoxia is 
a condition of reduced oxygen supply). If damage is detected, p53 triggers 
repair mechanisms. If repairs are unsuccessful, p53 signals apoptosis. A cell 
with an abnormal p53 protein cannot repair damaged DNA and thus cannot signal 
apoptosis. Cells with abnormal p53 can become cancerous. (credit: modification 
of work by Thierry Soussi) 


Human papillomavirus can cause cervical cancer. The virus encodes E6, a 
protein that binds p53. Based on this fact and what you know about p53, 
what effect do you think E6 binding has on p53 activity? 


a. E6 activates p53 

b. E6 inactivates p53 

c. E6 mutates p53 

d. E6 binding marks p53 for degradation 


The loss of p53 function has other repercussions for the cell cycle. Mutated 
p53 might lose its ability to trigger p21 production. Without adequate levels 
of p21, there is no effective block on Cdk activation. Essentially, without a 
fully functional p53, the G₁ checkpoint is severely compromised and the 
cell proceeds directly from G₁ to S regardless of internal and external 
conditions. At the completion of this shortened cell cycle, two daughter 
cells are produced that have inherited the mutated p53 gene. Given the non- 
optimal conditions under which the parent cell reproduced, it is likely that 
the daughter cells will have acquired other mutations in addition to the 
faulty tumor suppressor gene. Cells such as these daughter cells quickly 
accumulate both oncogenes and non-functional tumor suppressor genes. 
Again, the result is tumor growth. 
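

The contrast between normal and faulty p53 described in this section can be 
laid out as a simple decision sequence. The Python sketch below is a 
deliberately reduced illustration: the function name `g1_outcome`, its 
parameters, and the outcome strings are hypothetical, and the model treats 
p53 as either fully functional or fully non-functional, which real cells do 
not necessarily do. 

```python
def g1_outcome(dna_damaged: bool,
               repairable: bool,
               p53_functional: bool = True) -> str:
    """Return the fate of a cell at the G1 checkpoint in this toy model."""
    if not dna_damaged:
        # Nothing for p53 to respond to.
        return "enter S phase"
    if not p53_functional:
        # Faulty p53: no arrest, no repair signal, no apoptosis.
        return "enter S phase with damaged DNA"
    if repairable:
        # Functional p53 halts the cycle and recruits repair enzymes.
        return "arrest, repair DNA, then enter S phase"
    # Functional p53 and irreparable damage.
    return "apoptosis"
```

For example, `g1_outcome(True, False)` models a cell with irreparable damage 
and working p53, while `g1_outcome(True, False, p53_functional=False)` models 
the same damage in a cell that has lost p53 function. 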


Note: 
Link to Learning 
Watch an animation of how cancer results from errors in the cell cycle. 
https://www.openstaxcollege.org/I/cancer 


Section Summary 


Cancer is the result of unchecked cell division caused by a breakdown of 
the mechanisms that regulate the cell cycle. The loss of control begins with 
a change in the DNA sequence of a gene that codes for one of the 
regulatory molecules. Faulty instructions lead to a protein that does not 
function as it should. Any disruption of the monitoring system can allow 
other mistakes to be passed on to the daughter cells. Each successive cell 
division will give rise to daughter cells with even more accumulated 
damage. Eventually, all checkpoints become nonfunctional, and rapidly 
reproducing cells crowd out normal cells, resulting in a tumor or leukemia 
(blood cancer). 


Art Connections 


Exercise: 


Problem: 


[link] Human papillomavirus can cause cervical cancer. The virus 
encodes E6, a protein that binds p53. Based on this fact and what you 
know about p53, what effect do you think E6 binding has on p53 
activity? 


a. E6 activates p53 
b. E6 inactivates p53 
c. E6 mutates p53 


d. E6 binding marks p53 for degradation 


Solution: 


[link] D. E6 binding marks p53 for degradation. 


Review Questions 


Exercise: 
Problem: 


___________ are changes to the order of nucleotides in a segment of 
DNA that codes for a protein. 


a. Proto-oncogenes 

b. Tumor suppressor genes 
c. Gene mutations 

d. Negative regulators 


Solution: 


C 
Exercise: 


Problem: 


A gene that codes for a positive cell cycle regulator is called a(n) ________ 


a. kinase inhibitor. 

b. tumor suppressor gene. 
c. proto-oncogene. 

d. oncogene. 


Solution: 


C 
Exercise: 


Problem: 


A mutated gene that codes for an altered version of Cdk that is active 
in the absence of cyclin is a(n) ________ 


a. kinase inhibitor. 

b. tumor suppressor gene. 
c. proto-oncogene. 

d. oncogene. 


Solution: 


D 


Exercise: 


Problem: Which molecule is a Cdk inhibitor that is controlled by p53? 

a. cyclin 
b. anti-kinase 
c. Rb 
d. p21 


Solution: 


D 


Free Response 


Exercise: 


Problem: Outline the steps that lead to a cell becoming cancerous. 


Solution: 


If one of the genes that produces regulator proteins becomes mutated, 
it produces a malformed, possibly non-functional, cell cycle regulator, 
increasing the chance that more mutations will be left unrepaired in the 
cell. Each subsequent generation of cells sustains more damage. The 
cell cycle can speed up as a result of the loss of functional checkpoint 
proteins. The cells can lose the ability to self-destruct and eventually 
become “immortalized.” 


Exercise: 


Problem: 


Explain the difference between a proto-oncogene and a tumor 
suppressor gene. 


Solution: 


A proto-oncogene is a segment of DNA that codes for one of the 
positive cell cycle regulators. If that gene becomes mutated so that it 
produces a hyperactivated protein product, it is considered an 
oncogene. A tumor suppressor gene is a segment of DNA that codes 
for one of the negative cell cycle regulators. If that gene becomes 
mutated so that the protein product becomes less active, the cell cycle 
will run unchecked. A single oncogene can initiate abnormal cell 
divisions; however, tumor suppressors lose their effectiveness only 
when both copies of the gene are damaged. 


Exercise: 


Problem: 


List the regulatory mechanisms that might be lost in a cell producing 
faulty p53. 


Solution: 


Regulatory mechanisms that might be lost include monitoring of the 
quality of the genomic DNA, recruiting of repair enzymes, and the 
triggering of apoptosis. 


Exercise: 


Problem: 


p53 can trigger apoptosis if certain cell cycle events fail. How does 
this regulatory outcome benefit a multicellular organism? 


Solution: 


If a cell has damaged DNA, the likelihood of producing faulty proteins 
is higher. The daughter cells of such a damaged parent cell would also 
produce faulty proteins and might eventually become cancerous. If 
p53 recognizes this damage and triggers the cell to self-destruct, the 
damaged DNA is degraded and recycled. No further harm comes to the 
organism. Another healthy cell is triggered to divide instead. 


Glossary 


oncogene 
mutated version of a normal gene involved in the positive regulation of 
the cell cycle 


proto-oncogene 
normal gene that when mutated becomes an oncogene 


tumor suppressor gene 
segment of DNA that codes for regulator proteins that prevent the cell 
from undergoing uncontrolled division 


Propagation of the Signal 
By the end of this section, you will be able to: 


• Explain how the binding of a ligand initiates signal transduction 
  throughout a cell 
• Recognize the role of phosphorylation in the transmission of 
  intracellular signals 
• Evaluate the role of second messengers in signal transmission 


Once a ligand binds to a receptor, the signal is transmitted through the 
membrane and into the cytoplasm. Continuation of a signal in this manner 
is called signal transduction. Signal transduction only occurs with cell- 
surface receptors because internal receptors are able to interact directly with 
DNA in the nucleus to initiate protein synthesis. 


When a ligand binds to its receptor, conformational changes occur that 
affect the receptor’s intracellular domain. Conformational changes of the 
extracellular domain upon ligand binding can propagate through the 
membrane region of the receptor and lead to activation of the intracellular 
domain or its associated proteins. In some cases, binding of the ligand 
causes dimerization of the receptor, which means that two receptors bind 
to each other to form a stable complex called a dimer. A dimer is a 
chemical compound formed when two molecules (often identical) join 
together. The binding of the receptors in this manner enables their 
intracellular domains to come into close contact and activate each other. 


Binding Initiates a Signaling Pathway 


After the ligand binds to the cell-surface receptor, the activation of the 
receptor’s intracellular components sets off a chain of events that is called a 
signaling pathway or a signaling cascade. In a signaling pathway, second 
messengers, enzymes, and activated proteins interact with specific proteins, 
which are in turn activated in a chain reaction that eventually leads to a 
change in the cell’s environment ([link]). The events in the cascade occur in 
a series, much like a current flows in a river. Interactions that occur before a 
certain point are defined as upstream events, and events after that point are 
called downstream events. 


Note: 


Art Connection 

[Figure: Upon binding of epidermal growth factor (EGF) to the EGF receptor 
(EGFR), two proteins associated with the receptor called GRB2 and SOS 
activate RAS, a small G-protein. A protein kinase called RAF is activated by 
RAS-GTP. RAF phosphorylates MEK, which in turn phosphorylates ERK, a MAP 
kinase. The phosphorylated ERK enters the nucleus, where it triggers a 
cellular response. Figure labels: Nucleus; Stimulates translation. 
Stimulates: cell proliferation, cell migration and adhesion, angiogenesis 
(growth of new blood vessels). Inhibits: apoptosis.] 


The epidermal growth factor (EGF) receptor (EGFR) is a receptor tyrosine 
kinase involved in the regulation of cell growth, wound healing, and tissue 
repair. When EGF binds to the EGFR, a cascade of downstream events causes the 
cell to grow and divide. If EGFR is activated at inappropriate times, 
uncontrolled cell growth (cancer) may occur. 


In certain cancers, the GTPase activity of the RAS G-protein is inhibited. 
This means that the RAS protein can no longer hydrolyze GTP into GDP. 
What effect would this have on downstream cellular events? 


Signaling pathways can get very complicated very quickly because most 
cellular proteins can affect different downstream events, depending on the 
conditions within the cell. A single pathway can branch off toward different 
endpoints based on the interplay between two or more signaling pathways, 
and the same ligands are often used to initiate different signals in different 
cell types. This variation in response is due to differences in protein 
expression in different cell types. Another complicating element is signal 
integration of the pathways, in which signals from two or more different 
cell-surface receptors merge to activate the same response in the cell. This 
process can ensure that multiple external requirements are met before a cell 
commits to a specific response. 


The effects of extracellular signals can also be amplified by enzymatic 
cascades. At the initiation of the signal, a single ligand binds to a single 
receptor. However, activation of a receptor-linked enzyme can activate 
many copies of a component of the signaling cascade, which amplifies the 
signal. 
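

The arithmetic behind this amplification can be shown with a short 
calculation. The Python sketch below assumes, purely for illustration, that 
each activated molecule at one step activates a fixed number of molecules at 
the next step. The function name `amplified_molecules` and the fan-out 
values are hypothetical and are not measurements from any real pathway. 

```python
def amplified_molecules(fan_out_per_step: list[int]) -> int:
    """Count active molecules at the end of a toy enzymatic cascade.

    `fan_out_per_step[i]` is the assumed number of downstream molecules
    that each active molecule switches on at step i.
    """
    active = 1  # one ligand bound to one receptor starts the cascade
    for fan_out in fan_out_per_step:
        # Every active molecule activates `fan_out` copies of the next one.
        active *= fan_out
    return active
```

With an assumed fan-out of 10 at each of three steps, 
`amplified_molecules([10, 10, 10])` gives 1000 active molecules from a single 
ligand-binding event, which is why a small external signal can produce a 
large cellular response. 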


Methods of Intracellular Signaling 


The induction of a signaling pathway depends on the modification of a 
cellular component by an enzyme. There are numerous enzymatic 
modifications that can occur, and they are recognized in turn by the next 
component downstream. The following are some of the more common 
events in intracellular signaling. 


Note: 
Link to Learning 
Observe an animation of cell signaling at this site. 


Phosphorylation 


One of the most common chemical modifications that occurs in signaling 
pathways is the addition of a phosphate group (PO₄³⁻) to a molecule such as 
a protein in a process called phosphorylation. The phosphate can be added 
to a nucleotide such as GMP to form GDP or GTP. Phosphates are also 
often added to serine, threonine, and tyrosine residues of proteins, where 
they replace the hydroxyl group of the amino acid ([link]). The transfer of 
the phosphate is catalyzed by an enzyme called a kinase. Various kinases 
are named for the substrate they phosphorylate. Phosphorylation of serine 
and threonine residues often activates enzymes. Phosphorylation of tyrosine 
residues can either affect the activity of an enzyme or create a binding site 
that interacts with downstream components in the signaling cascade. 
Phosphorylation may activate or inactivate enzymes, and the reversal of 
phosphorylation, dephosphorylation by a phosphatase, will reverse the 
effect. 

Phosphotyrosine 


In protein phosphorylation, a 
phosphate group (PO₄³⁻) is 
added to residues of the 
amino acids serine, threonine, 
and tyrosine. 


Second Messengers 


Second messengers are small molecules that propagate a signal after it has 
been initiated by the binding of the signaling molecule to the receptor. 
These molecules help to spread a signal through the cytoplasm by altering 
the behavior of certain cellular proteins. 


Calcium ion is a widely used second messenger. The free concentration of 
calcium ions (Ca²⁺) within a cell is very low because ion pumps in the 
plasma membrane continuously use adenosine-5'-triphosphate (ATP) to 
remove them. For signaling purposes, Ca²⁺ is stored in cytoplasmic vesicles, 
such as the endoplasmic reticulum, or accessed from outside the cell. When 
signaling occurs, ligand-gated calcium ion channels allow the higher levels 
of Ca²⁺ that are present outside the cell (or in intracellular storage 
compartments) to flow into the cytoplasm, which raises the concentration of 
cytoplasmic Ca²⁺. The response to the increase in Ca²⁺ varies, depending 
on the cell type involved. For example, in the β-cells of the pancreas, Ca²⁺ 
signaling leads to the release of insulin, and in muscle cells, an increase in 
Ca²⁺ leads to muscle contractions. 


Another second messenger utilized in many different cell types is cyclic 
AMP (cAMP). Cyclic AMP is synthesized by the enzyme adenylyl cyclase 
from ATP ([link]). The main role of cAMP in cells is to bind to and activate 
an enzyme called cAMP-dependent kinase (A-kinase). A-kinase regulates 
many vital metabolic pathways: It phosphorylates serine and threonine 
residues of its target proteins, activating them in the process. A-kinase is 
found in many different types of cells, and the target proteins in each kind 
of cell are different. Differences give rise to the variation of the responses to 
cAMP in different cells. 


This diagram shows the mechanism for the formation of cyclic AMP (cAMP).
cAMP serves as a second messenger to activate or inactivate proteins within
the cell. Termination of the signal occurs when an enzyme called
phosphodiesterase converts cAMP into AMP.


Present in small concentrations in the plasma membrane, inositol 
phospholipids are lipids that can also be converted into second messengers. 
Because these molecules are membrane components, they are located near 
membrane-bound receptors and can easily interact with them. 
Phosphatidylinositol (PI) is the main phospholipid that plays a role in
cellular signaling. Enzymes known as kinases phosphorylate PI to form
PI-phosphate (PIP) and PI-bisphosphate (PIP₂).


The enzyme phospholipase C cleaves PIP₂ to form diacylglycerol (DAG)
and inositol triphosphate (IP₃) ([link]). These products of the cleavage of
PIP₂ serve as second messengers. Diacylglycerol (DAG) remains in the
plasma membrane and activates protein kinase C (PKC), which then
phosphorylates serine and threonine residues in its target proteins. IP₃
diffuses into the cytoplasm and binds to ligand-gated calcium channels in
the endoplasmic reticulum to release Ca²⁺ that continues the signal cascade.


The enzyme phospholipase C breaks down PIP₂ into IP₃ and DAG, both of
which serve as second messengers.


Section Summary 


Ligand binding to the receptor allows for signal transduction through the 
cell. The chain of events that conveys the signal through the cell is called a 
signaling pathway or cascade. Signaling pathways are often very complex 
because of the interplay between different proteins. A major component of 
cell signaling cascades is the phosphorylation of molecules by enzymes 
known as kinases. Phosphorylation adds a phosphate group to serine, 
threonine, and tyrosine residues in a protein, changing their shapes, and 
activating or inactivating the protein. Small molecules like nucleotides can 
also be phosphorylated. Second messengers are small, non-protein 
molecules that are used to transmit a signal within a cell. Some examples of 
second messengers are calcium ions (Ca²⁺), cyclic AMP (cAMP),
diacylglycerol (DAG), and inositol triphosphate (IP₃).


Art Connections 


Exercise: 


Problem: 


[link] In certain cancers, the GTPase activity of the RAS G-protein is 
inhibited. This means that the RAS protein can no longer hydrolyze 
GTP into GDP. What effect would this have on downstream cellular 
events? 


Solution: 
[link] ERK would become permanently activated, resulting in cell
proliferation, migration, adhesion, and the growth of new blood
vessels. Apoptosis would be inhibited.


Review Questions 


Exercise: 


Problem: Where do DAG and IP₃ originate?


a. They are formed by phosphorylation of cAMP. 

b. They are ligands expressed by signaling cells. 

c. They are hormones that diffuse through the plasma membrane to 
stimulate protein production. 

d. They are the cleavage products of the inositol phospholipid, PIP₂.


Solution: 


D 
Exercise: 


Problem: 


What property enables the residues of the amino acids serine, 
threonine, and tyrosine to be phosphorylated? 


a. They are polar. 

b. They are non-polar. 

c. They contain a hydroxyl group. 

d. They occur more frequently in the amino acid sequence of 
signaling proteins. 


Solution: 


C 


Free Response 


Exercise: 


Problem: 


The same second messengers are used in many different cells, but the 
response to second messengers is different in each cell. How is this 
possible? 


Solution: 


Different cells produce different proteins, including cell-surface 
receptors and signaling pathway components. Therefore, they respond 
to different ligands, and the second messengers activate different 
pathways. Signal integration can also change the end result of 
signaling. 


Exercise: 


Problem: 


What would happen if the intracellular domain of a cell-surface 
receptor was switched with the domain from another receptor? 


Solution: 


The binding of the ligand to the extracellular domain would activate 
the pathway normally activated by the receptor donating the 
intracellular domain. 


Glossary 


cyclic AMP (cAMP) 
second messenger that is derived from ATP 


cyclic AMP-dependent kinase 
(also, protein kinase A, or PKA) kinase that is activated by binding to 
cAMP 


diacylglycerol (DAG)
cleavage product of PIP₂ that is used for signaling within the plasma
membrane


dimer 
chemical compound formed when two molecules join together 


dimerization 
(of receptor proteins) interaction of two receptor proteins to form a 
functional complex called a dimer 


inositol phospholipid 
lipid present at small concentrations in the plasma membrane that is 
converted into a second messenger; it has inositol (a carbohydrate) as 
its hydrophilic head group 


inositol triphosphate (IP₃)
cleavage product of PIP₂ that is used for signaling within the cell


kinase 
enzyme that catalyzes the transfer of a phosphate group from ATP to 
another molecule 


second messenger 
small, non-protein molecule that propagates a signal within the cell 
after activation of a receptor causes its release 


signal integration 
interaction of signals from two or more different cell-surface receptors 
that merge to activate the same response in the cell 


signal transduction 
propagation of the signal through the cytoplasm (and sometimes also 
the nucleus) of the cell 


signaling pathway 
(also signaling cascade) chain of events that occurs in the cytoplasm of 
the cell to propagate the signal from the plasma membrane to produce 
a response 


Response to the Signal 
By the end of this section, you will be able to: 


• Describe how signaling pathways direct protein expression, cellular
metabolism, and cell growth

• Identify the function of PKC in signal transduction pathways

• Recognize the role of apoptosis in the development and maintenance
of a healthy organism


Inside the cell, ligands bind to their internal receptors, allowing them to 
directly affect the cell’s DNA and protein-producing machinery. Using 
signal transduction pathways, receptors in the plasma membrane produce a 
variety of effects on the cell. The results of signaling pathways are 
extremely varied and depend on the type of cell involved as well as the 
external and internal conditions. A small sampling of responses is described 
below. 


Gene Expression 


Some signal transduction pathways regulate the transcription of RNA. 
Others regulate the translation of proteins from mRNA. An example of a 
protein that regulates translation in the nucleus is the MAP kinase ERK. 
ERK is activated in a phosphorylation cascade when epidermal growth 
factor (EGF) binds the EGF receptor (see [link]). Upon phosphorylation, 
ERK enters the nucleus and activates a protein kinase that, in turn, regulates 
protein translation ([link]).


The MAP kinase ERK phosphorylates MNK1. MNK1 in turn phosphorylates eIF-4E,
which is associated with mRNA. The mRNA unfolds and protein synthesis begins.


ERK is a MAP kinase that 
activates translation when it is 
phosphorylated. ERK 
phosphorylates MNK1, which in 
turn phosphorylates eIF-4E, an 
eukaryotic initiation factor that,
with other initiation factors, is 
associated with mRNA. When 
eIF-4E becomes phosphorylated, 
the mRNA unfolds, allowing 
protein synthesis in the nucleus to 
begin. (See [link] for the 
phosphorylation pathway that 
activates ERK.) 


The second kind of protein with which PKC can interact is a protein that 
acts as an inhibitor. An inhibitor is a molecule that binds to a protein and 
prevents it from functioning or reduces its function. In this case, the 


inhibitor is a protein called Iκ-B, which binds to the regulatory protein
NF-κB. (The symbol κ represents the Greek letter kappa.) When Iκ-B is bound
to NF-κB, the complex cannot enter the nucleus of the cell, but when Iκ-B
is phosphorylated by PKC, it can no longer bind NF-κB, and NF-κB (a
transcription factor) can enter the nucleus and initiate RNA transcription. In
this case, the effect of phosphorylation is to inactivate an inhibitor and 
thereby activate the process of transcription. 


Increase in Cellular Metabolism 


The result of another signaling pathway affects muscle cells. The activation 
of β-adrenergic receptors in muscle cells by adrenaline leads to an increase
in cyclic AMP (cAMP) inside the cell. Also known as epinephrine, 
adrenaline is a hormone (produced by the adrenal gland attached to the 
kidney) that readies the body for short-term emergencies. Cyclic AMP 
activates PKA (protein kinase A), which in turn phosphorylates two 
enzymes. The first enzyme promotes the degradation of glycogen by 
activating intermediate glycogen phosphorylase kinase (GPK) that in turn 
activates glycogen phosphorylase (GP) that catabolizes glycogen into 
glucose. (Recall that your body converts excess glucose to glycogen for 
short-term storage. When energy is needed, glycogen is quickly reconverted 
to glucose.) Phosphorylation of the second enzyme, glycogen synthase 
(GS), inhibits its ability to form glycogen from glucose. In this manner, a 
muscle cell obtains a ready pool of glucose by activating its formation via 
glycogen degradation and by inhibiting the use of glucose to form 
glycogen, thus preventing a futile cycle of glycogen degradation and 
synthesis. The glucose is then available for use by the muscle cell in 
response to a sudden surge of adrenaline—the “fight or flight” reflex. 
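

The reciprocal regulation described in this section can be sketched as a small Boolean model. This is only an illustrative simplification: every step is treated as all-or-nothing, which real cells are not, and the function name glycogen_response and the dictionary keys are hypothetical labels chosen for this example rather than standard terminology.

```python
def glycogen_response(adrenaline_bound: bool) -> dict:
    """Toy Boolean model of the adrenaline pathway in a muscle cell.

    Assumption: each component is either fully on or fully off.
    """
    camp_elevated = adrenaline_bound   # receptor activation raises cAMP
    pka_active = camp_elevated         # cAMP activates protein kinase A (PKA)
    gpk_active = pka_active            # PKA phosphorylates (activates) GPK
    gp_active = gpk_active             # GPK activates glycogen phosphorylase (GP)
    gs_active = not pka_active         # PKA phosphorylation inhibits glycogen synthase (GS)
    return {
        "glycogen_breakdown": gp_active,
        "glycogen_synthesis": gs_active,
    }


print(glycogen_response(adrenaline_bound=True))
print(glycogen_response(adrenaline_bound=False))
```

In this sketch the two outputs are never both true, which mirrors the point made in the text: the pathway avoids a futile cycle of simultaneous glycogen degradation and synthesis.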


Cell Growth 


Cell signaling pathways also play a major role in cell division. Cells do not 
normally divide unless they are stimulated by signals from other cells. The 
ligands that promote cell growth are called growth factors. Most growth 
factors bind to cell-surface receptors that are linked to tyrosine kinases. 
These cell-surface receptors are called receptor tyrosine kinases (RTKs). 


Activation of RTKs initiates a signaling pathway that includes a G-protein 
called RAS, which activates the MAP kinase pathway described earlier. The 
enzyme MAP kinase then stimulates the expression of proteins that interact 
with other cellular components to initiate cell division. 


Note: 

Career Connection 

Cancer Biologist 

Cancer biologists study the molecular origins of cancer with the goal of 
developing new prevention methods and treatment strategies that will 
inhibit the growth of tumors without harming the normal cells of the body. 
As mentioned earlier, signaling pathways control cell growth. These 
signaling pathways are controlled by signaling proteins, which are, in turn, 
expressed by genes. Mutations in these genes can result in malfunctioning 
signaling proteins. This prevents the cell from regulating its cell cycle, 
triggering unrestricted cell division and cancer. The genes that regulate the 
signaling proteins are one type of oncogene, which is a gene that has the
potential to cause cancer. The gene encoding RAS is an oncogene that was 
originally discovered when mutations in the RAS protein were linked to 
cancer. Further studies have indicated that 30 percent of cancer cells have a 
mutation in the RAS gene that leads to uncontrolled growth. If left 
unchecked, uncontrolled cell division can lead to tumor formation and
metastasis, the growth of cancer cells in new locations in the body. 

Cancer biologists have been able to identify many other oncogenes that 
contribute to the development of cancer. For example, HER2 is a cell- 
surface receptor that is present in excessive amounts in 20 percent of 
human breast cancers. Cancer biologists realized that gene duplication led 
to HER2 overexpression in 25 percent of breast cancer patients and 
developed a drug called Herceptin (trastuzumab). Herceptin is a 
monoclonal antibody that targets HER2 for removal by the immune 
system. Herceptin therapy helps to control signaling through HER2. The 
use of Herceptin in combination with chemotherapy has helped to increase 
the overall survival rate of patients with metastatic breast cancer. 

More information on cancer biology research can be found at the National
Cancer Institute website
(http://www.cancer.gov/cancertopics/understandingcancer/targetedtherapies).


Cell Death 


When a cell is damaged, superfluous, or potentially dangerous to an 
organism, a cell can initiate a mechanism to trigger programmed cell death, 
or apoptosis. Apoptosis allows a cell to die in a controlled manner that 
prevents the release of potentially damaging molecules from inside the cell. 
There are many internal checkpoints that monitor a cell’s health; if 
abnormalities are observed, a cell can spontaneously initiate the process of 
apoptosis. However, in some cases, such as a viral infection or uncontrolled 
cell division due to cancer, the cell’s normal checks and balances fail. 
External signaling can also initiate apoptosis. For example, most normal 
animal cells have receptors that interact with the extracellular matrix, a 
network of glycoproteins that provides structural support for cells in an 
organism. The binding of cellular receptors to the extracellular matrix 
initiates a signaling cascade within the cell. However, if the cell moves 
away from the extracellular matrix, the signaling ceases, and the cell 
undergoes apoptosis. This system keeps cells from traveling through the 
body and proliferating out of control, as happens with tumor cells that 
metastasize. 


Another example of external signaling that leads to apoptosis occurs in T- 
cell development. T-cells are immune cells that bind to foreign 
macromolecules and particles, and target them for destruction by the 
immune system. Normally, T-cells do not target “self” proteins (those of
their own organism); when they do, autoimmune diseases can result. In
order to develop the ability to discriminate between self and non-self, 
immature T-cells undergo screening to determine whether they bind to so- 
called self proteins. If the T-cell receptor binds to self proteins, the cell 
initiates apoptosis to remove the potentially dangerous cell. 


Apoptosis is also essential for normal embryological development. In
vertebrates, for example, early stages of development include the formation
of web-like tissue between individual fingers and toes ([link]). During the
course of normal development, these unneeded cells must be eliminated,
enabling fully separated fingers and toes to form. A cell signaling
mechanism triggers apoptosis, which destroys the cells between the
developing digits.


The histological section of a foot of a 15-day-old mouse embryo, visualized
using light microscopy, reveals areas of tissue between the toes, which
apoptosis will eliminate before the mouse reaches its full gestational age at
27 days. (credit: modification of work by Michal Manas)


Termination of the Signal Cascade 


The aberrant signaling often seen in tumor cells is proof that the termination 
of a signal at the appropriate time can be just as important as the initiation 
of a signal. One method of stopping a specific signal is to degrade the 
ligand or remove it so that it can no longer access its receptor. One reason
that hydrophobic hormones like estrogen and testosterone trigger long-lasting
events is that they bind carrier proteins. These proteins allow the
insoluble molecules to be soluble in blood, but they also protect the 
hormones from degradation by circulating enzymes. 


Inside the cell, many different enzymes reverse the cellular modifications 
that result from signaling cascades. For example, phosphatases are 
enzymes that remove the phosphate group attached to proteins by kinases in 
a process called dephosphorylation. Cyclic AMP (cAMP) is degraded into 
AMP by phosphodiesterase, and the release of calcium stores is reversed 
by the Ca²⁺ pumps that are located in the external and internal membranes
of the cell. 


Section Summary 


The initiation of a signaling pathway is a response to external stimuli. This 
response can take many different forms, including protein synthesis, a 
change in the cell’s metabolism, cell growth, or even cell death. Many 
pathways influence the cell by initiating gene expression, and the methods 
utilized are quite numerous. Some pathways activate enzymes that interact 
with DNA transcription factors. Others modify proteins and induce them to 
change their location in the cell. Depending on the status of the organism, 
cells can respond by storing energy as glycogen or fat, or making it 
available in the form of glucose. A signal transduction pathway allows 
muscle cells to respond to immediate requirements for energy in the form of 
glucose. Cell growth is almost always stimulated by external signals called 
growth factors. Uncontrolled cell growth leads to cancer, and mutations in
the genes encoding protein components of signaling pathways are often
found in tumor cells. Programmed cell death, or apoptosis, is important for 
removing damaged or unnecessary cells. The use of cellular signaling to 
organize the dismantling of a cell ensures that harmful molecules from the 
cytoplasm are not released into the spaces between cells, as they are in 
uncontrolled death, necrosis. Apoptosis also ensures the efficient recycling 
of the components of the dead cell. Termination of the cellular signaling 
cascade is very important so that the response to a signal is appropriate in 
both timing and intensity. Degradation of signaling molecules and 
dephosphorylation of phosphorylated intermediates of the pathway by 
phosphatases are two ways to terminate signals within the cell. 


Review Questions 


Exercise: 


Problem: What is the function of a phosphatase? 


a. A phosphatase removes phosphorylated amino acids from 
proteins. 

b. A phosphatase removes the phosphate group from phosphorylated 
amino acid residues in a protein. 

c. A phosphatase phosphorylates serine, threonine, and tyrosine 
residues. 

d. A phosphatase degrades second messengers in the cell. 


Solution: 
B 
Exercise: 
Problem: How does NF-κB induce gene expression?


a. A small, hydrophobic ligand binds to NF-κB, activating it.

b. Phosphorylation of the inhibitor Iκ-B dissociates the complex
between it and NF-κB, and allows NF-κB to enter the nucleus and
stimulate transcription.

c. NF-κB is phosphorylated and is then free to enter the nucleus and
bind DNA.

d. NF-κB is a kinase that phosphorylates a transcription factor that
binds DNA and promotes protein production.


Solution: 


B 
Exercise: 


Problem: 
Apoptosis can occur in a cell when the cell is 


a. damaged 

b. no longer needed 
c. infected by a virus 
d. all of the above 


Solution: 
D 
Exercise: 
Problem: What is the effect of an inhibitor binding an enzyme? 
a. The enzyme is degraded. 
b. The enzyme is activated. 


c. The enzyme is inactivated. 
d. The complex is transported out of the cell. 


Solution:


C


Free Response 


Exercise: 
Problem: 


What is a possible result of a mutation in a kinase that controls a 
pathway that stimulates cell growth? 


Solution: 


If a kinase is mutated so that it is always activated, it will continuously 
signal through the pathway and lead to uncontrolled growth and 
possibly cancer. If a kinase is mutated so that it cannot function, the 
cell will not respond to ligand binding. 


Exercise: 


Problem: 


How does the extracellular matrix control the growth of cells? 
Solution: 


Receptors on the cell surface must be in contact with the extracellular 
matrix in order to receive positive signals that allow the cell to live. If 
the receptors are not activated by binding, the cell will undergo 
apoptosis. This ensures that cells are in the correct place in the body 
and helps to prevent invasive cell growth as occurs in metastasis in 
cancer. 


Glossary 


apoptosis 
programmed cell death 


growth factor 
ligand that binds to cell-surface receptors and stimulates cell growth 


inhibitor 
molecule that binds to a protein (usually an enzyme) and keeps it from 
functioning 


phosphatase 
enzyme that removes the phosphate group from a molecule that has 
been previously phosphorylated 


phosphodiesterase 
enzyme that degrades cAMP, producing AMP, to terminate signaling 


ExperimentalUpload 
Lecture Notes Not Interesting Very Rough 


LEARN.GENETICS.UTAH --Making a Transgenic Mouse 


SOURCE 


http://learn.genetics.utah.edu/content/science/transgenic/ 


Mario Capecchi


Mario R. Capecchi, Ph.D., of the University of Utah, won the 2007 Nobel 
Prize in Physiology or Medicine. Capecchi shared the prize with Oliver 
Smithies of University of North Carolina, Chapel Hill, and Sir Martin 
Evans of Cardiff University in the UK. 


The prize recognized Capecchi's pioneering work on "knockout mouse" 
technology, a gene-targeting technique that has revolutionized genetic and 
biomedical research, allowing scientists to create animal models for 
hundreds of human diseases. 


As a child, Capecchi wandered homeless in Italy. As a researcher, his first 
attempts at gene targeting were deemed not ready for funding by the 
National Institutes of Health. Capecchi is an individual whose personal life 
proves that while some events are not probable, anything is possible.


During the 1980s, Capecchi devised a way to change or remove any single 
gene in the mouse genome, creating strains of mice that pass the altered 
gene from parent to offspring. In the years since, these "transgenic" and 
"knockout" mice have become commonplace in the laboratory. 


Capecchi's pioneering work in gene targeting has taught us much about how
the body builds, and rebuilds, itself. He has given scientists worldwide
the tools to make important discoveries about human diseases, from cancer
to obesity.


And he has raised a key question for the future of human medicine: if we 
can replace a perfectly good gene with a mutated one, can we also go the 
other way, replacing problem genes with those that work? 


Mario’s Lab at Work 


What makes an arm an arm? Capecchi's research team is working on 
answering that question using gene targeting. They have systematically 
"knocked out" a set of genes in mice, called homeotic genes, which govern 
body patterning during development. For example, one of the lab's most 
recent genetic discoveries may explain why we lack spare ribs. Find out
more about how homeotic genes work in Genes Determine Body Patterns.


YOUR GOAL: You are studying how a particular gene, named OhNo, 
might play a role in panic attacks. You want to study mice that are missing 
this gene. To "knock out" the OhNo gene, you will replace it with a mutated
copy that doesn't work. 


Here's how: 
1. Isolate Stem Cells 


Isolate embryonic stem cells that originated from male brown mice with a 
normal OhNo gene (blue). 


2. Add Inactive Gene With Marker 


To these cells, add a copy containing a mutated, inactive OhNo gene (red),
and a drug resistance marker gene (pink).


3. Similar Genes Naturally Swap


By mechanisms that are not completely understood yet, similar genes will
swap places. The inactive OhNo gene plus drug resistance marker gene is
incorporated into the genome, and the normal version is kicked out. This
process is called homologous recombination.


4. Add Drug 


Cells that haven't incorporated the inactive OhNo gene don't have the drug 
resistance marker gene (pink). 


Adding the drug kills cells without the marker, leaving you with only cells 
that have an inactive version of the OhNo gene. 


5. Grow Chimeric Mice 


By transplanting stem cells that carry the inactive OhNo gene into a white
mouse embryo, you'll create what is called a chimera. Chimeras have
patches of cells throughout their bodies that grew from white mouse cells 
and patches that grew from brown stem cells. Some of the cells that have 
the inactive OhNo gene may develop into reproductive cells. 


Chimeras are easy to identify because they have both brown and white 
patches of fur. 


6. Mate Male Chimera 


If a male chimera has some reproductive cells (sperm) that originated from 
the brown stem cells, he will produce some brown offspring when mated 
with a white female. 


7. Test and Breed Brown Offspring 


Half of the brown offspring will have a copy of the inactive OhNo gene in
all of their cells, including their reproductive cells. These mice have one
normal copy of the OhNo gene from their mother (not shown) and one inactive
copy from their father. So half of their reproductive cells will contain a
normal copy, and half will contain an inactive copy.


These mice can be identified by sequencing their OhNo genes and can then be
bred with each other.


8. You've Made a Knockout Mouse 


One fourth of your resulting offspring will have two copies of the
"knocked-out" or inactive OhNo gene. You can now study these mice to determine
how lacking the OhNo gene may affect panic attacks. 
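

The fractions quoted in steps 7 and 8 (one half and one fourth) follow from a simple one-gene cross, which can be sketched as a Punnett square in code. This is a minimal illustration under simple Mendelian assumptions; the allele symbols '+' (normal OhNo) and '-' (inactive OhNo) and the function names cross and fractions are hypothetical choices for this example.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes from a one-gene cross.

    Each parent is a two-character string of alleles:
    '+' is the normal OhNo allele, '-' is the inactive allele.
    """
    offspring = Counter()
    for a, b in product(parent1, parent2):
        genotype = "".join(sorted(a + b))  # treat '+-' and '-+' as the same genotype
        offspring[genotype] += 1
    return offspring


def fractions(counts: Counter) -> dict:
    """Convert genotype counts into fractions of all offspring."""
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


# Step 7: sperm derived from the heterozygous stem cells x normal female.
step7 = fractions(cross("+-", "++"))

# Step 8: two heterozygous carriers bred with each other.
step8 = fractions(cross("+-", "+-"))

print(step7)
print(step8)
```

In this sketch, the '--' entry of the second cross corresponds to the knockout mice described in step 8.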


Supported by a Science Education Partnership Award (SEPA) Grant No. 
R25RRO023288 from the National Center for Research Resources, a 
component of the NIH. The contents provided here are solely the 
responsibility of the authors and do not necessarily represent the official 
views of NIH. 


Genomics and Proteomics 
By the end of this section, you will be able to: 


• Explain systems biology
• Describe a proteome
• Define protein signature


Proteins are the final products of genes, which help perform the function 
encoded by the gene. Proteins are composed of amino acids and play 
important roles in the cell. All enzymes (except ribozymes) are proteins that 
act as catalysts to affect the rate of reactions. Proteins are also regulatory 
molecules, and some are hormones. Transport proteins, such as 
hemoglobin, help transport oxygen to various organs. Antibodies that 
defend against foreign particles are also proteins. In the diseased state, 
protein function can be impaired because of changes at the genetic level or 
because of direct impact on a specific protein. 


A proteome is the entire set of proteins produced by a cell type. Proteomes 
can be studied using the knowledge of genomes because genes code for 
mRNAs, and the mRNAs encode proteins. Although mRNA analysis is a 
step in the right direction, not all mRNAs are translated into proteins. The 
study of the function of proteomes is called proteomics. Proteomics 
complements genomics and is useful when scientists want to test their 
hypotheses that were based on genes. Even though all cells of a 
multicellular organism have the same set of genes, the set of proteins 
produced in different tissues is different and dependent on gene expression. 
Thus, the genome is constant, but the proteome varies and is dynamic 
within an organism. In addition, RNAs can be alternatively spliced (cut and
pasted to create novel combinations and novel proteins) and many proteins
are modified after translation by processes such as proteolytic cleavage, 
phosphorylation, glycosylation, and ubiquitination. There are also protein- 
protein interactions, which complicate the study of proteomes. Although the 
genome provides a blueprint, the final architecture depends on several 
factors that can change the progression of events that generate the 
proteome. 


Metabolomics is related to genomics and proteomics. Metabolomics
involves the study of small molecule metabolites found in an organism. The
metabolome is the complete set of metabolites that are related to the
genetic makeup of an organism. Metabolomics offers an opportunity to
compare genetic makeup and physical characteristics, as well as genetic 
makeup and environmental factors. The goal of metabolome research is to 
identify, quantify, and catalogue all of the metabolites that are found in the 
tissues and fluids of living organisms. 


Basic Techniques in Protein Analysis 


The ultimate goal of proteomics is to identify or compare the proteins 
expressed from a given genome under specific conditions, study the 
interactions between the proteins, and use the information to predict cell 
behavior or develop drug targets. Just as the genome is analyzed using the 
basic technique of DNA sequencing, proteomics requires techniques for 
protein analysis. The basic technique for protein analysis, analogous to 
DNA sequencing, is mass spectrometry. Mass spectrometry is used to 
identify and determine the characteristics of a molecule. Advances in 
spectrometry have allowed researchers to analyze very small samples of 
protein. X-ray crystallography enables scientists to determine
the three-dimensional structure of a protein crystal at atomic resolution. 
Another protein imaging technique, nuclear magnetic resonance (NMR), 
uses the magnetic properties of atoms to determine the three-dimensional 
structure of proteins in aqueous solution. Protein microarrays have also 
been used to study interactions between proteins. Large-scale adaptations of 
the basic two-hybrid screen ([link]) have provided the basis for protein
microarrays. Computer software is used to analyze the vast amount of data 
generated for proteomic analysis. 


Genomic- and proteomic-scale analyses are part of systems biology. 
Systems biology is the study of whole biological systems (genomes and 
proteomes) based on interactions within the system. The European 
Bioinformatics Institute and the Human Proteome Organization (HUPO) are 
developing and establishing effective tools to sort through the enormous 
pile of systems biology data. Because proteins are the direct products of 
genes and reflect activity at the genomic level, it is natural to use proteomes 
to compare the protein profiles of different cells to identify proteins and 
genes involved in disease processes. Most pharmaceutical drug trials target
proteins. Information obtained from proteomics is being used to identify
novel drugs and understand their mechanisms of action.


If the bait protein interacts with the prey protein, the transcription
factor's activator domain binds to the binding domain, and
transcription of the reporter gene occurs.


If the prey doesn't catch the bait, no transcription of the reporter gene
occurs.


Two-hybrid screening is used to 
determine whether two proteins 
interact. In this method, a 
transcription factor is split into a 
DNA-binding domain (BD) and an 
activator domain (AD). The binding 
domain is able to bind the promoter in 
the absence of the activator domain, 
but it does not turn on transcription. A 
protein called the bait is attached to 
the BD, and a protein called the prey 
is attached to the AD. Transcription 
occurs only if the prey “catches” the 
bait. 
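

The readout logic of a two-hybrid screen can be sketched as a simple lookup: the reporter is transcribed only when the bait and prey interact. This is an idealized illustration that ignores false positives and false negatives in real screens; the function name reporter_transcribed and the protein names are hypothetical and invented for this example.

```python
def reporter_transcribed(bait: str, prey: str, interactions: set) -> bool:
    """Return True if the reporter gene would be transcribed.

    Assumption: the BD-bait fusion always sits on the promoter, and
    transcription requires the AD-prey fusion to be recruited through
    a direct bait-prey interaction.
    """
    return frozenset((bait, prey)) in interactions


# Hypothetical interaction data, invented purely for illustration.
known_interactions = {frozenset(("ProteinA", "ProteinB"))}

print(reporter_transcribed("ProteinA", "ProteinB", known_interactions))
print(reporter_transcribed("ProteinA", "ProteinC", known_interactions))
```

Storing each pair as a frozenset makes the lookup independent of which protein is used as bait and which as prey, a simplification that does not always hold in practice.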


The challenge of techniques used for proteomic analyses is the difficulty in 
detecting small quantities of proteins. Although mass spectrometry is good 
for detecting small amounts of proteins, variations in protein expression in 
diseased states can be difficult to discern. Proteins are naturally unstable 
molecules, which makes proteomic analysis much more difficult than 
genomic analysis. 


Cancer Proteomics 


Genomes and proteomes of patients suffering from specific diseases are 
being studied to understand the genetic basis of the disease. The most 
prominent disease being studied with proteomic approaches is cancer. 
Proteomic approaches are being used to improve screening and early 
detection of cancer; this is achieved by identifying proteins whose 
expression is affected by the disease process. An individual protein is called 
a biomarker, whereas a set of proteins with altered expression levels is 
called a protein signature. For a biomarker or protein signature to be 
useful as a candidate for early screening and detection of a cancer, it must 
be secreted in body fluids, such as sweat, blood, or urine, such that
large-scale screenings can be performed in a non-invasive fashion. The current
problem with using biomarkers for the early detection of cancer is the high 
rate of false-negative results. A false negative is an incorrect test result that 
should have been positive. In other words, many cases of cancer go 
undetected, which makes biomarkers unreliable. Some examples of protein 
biomarkers used in cancer detection are CA-125 for ovarian cancer and 
PSA for prostate cancer. Protein signatures may be more reliable than 
biomarkers to detect cancer cells. Proteomics is also being used to develop 
individualized treatment plans, which involves the prediction of whether or 
not an individual will respond to specific drugs and the side effects that the 
individual may experience. Proteomics is also being used to predict the 
possibility of disease recurrence. 
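

The false-negative problem described above can be made concrete with a
little arithmetic. The sketch below is illustrative only: the function name
and the patient counts are hypothetical and are not data from any real
screening study.

```python
def false_negative_rate(true_positives, false_negatives):
    """Fraction of people who have the disease but test negative."""
    diseased = true_positives + false_negatives
    if diseased == 0:
        raise ValueError("no diseased individuals in the sample")
    return false_negatives / diseased


# Hypothetical screening of 200 people who all have the cancer:
# the biomarker test flags 150 of them and misses 50.
rate = false_negative_rate(true_positives=150, false_negatives=50)
sensitivity = 1 - rate
print(f"false-negative rate: {rate:.2f}, sensitivity: {sensitivity:.2f}")
```

A test with a high false-negative rate has low sensitivity, which is why
many cases would go undetected.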


The National Cancer Institute has developed programs to improve the 
detection and treatment of cancer. The Clinical Proteomic Technologies for 
Cancer and the Early Detection Research Network are efforts to identify 
protein signatures specific to different types of cancers. The Biomedical 


Proteomics Program is designed to identify protein signatures and design 
effective therapies for cancer patients. 


Section Summary 


Proteomics is the study of the entire set of proteins expressed by a given 
type of cell under certain environmental conditions. In a multicellular 
organism, different cell types will have different proteomes, and these will 
vary with changes in the environment. Unlike a genome, a proteome is 
dynamic and in constant flux, which makes it both more complicated and 
more useful than the knowledge of genomes alone. 


Proteomics approaches rely on protein analysis; these techniques are 
constantly being upgraded. Proteomics has been used to study different 
types of cancer. Different biomarkers and protein signatures are being used 
to analyze each type of cancer. The future goal is to have a personalized 
treatment plan for each individual. 


Review Questions 


Exercise: 


Problem: What is a biomarker?


a. the color coding of different genes 

b. a protein that is uniquely produced in a diseased state 
c. a molecule in the genome or proteome

d. a marker that is genetically inherited 


Solution: 


B 


Exercise: 


Problem: A protein signature is:


a. the path followed by a protein after it is synthesized in the 
nucleus 

b. the path followed by a protein in the cytoplasm 

c. a protein expressed on the cell surface 

d. a unique set of proteins present in a diseased state 


Solution: 


D 


Free Response 


Exercise: 


Problem: 

How has proteomics been used in cancer detection and treatment? 
Solution: 

Proteomics has provided a way to detect biomarkers and protein 


signatures, which have been used to screen for the early detection of 
cancer. 


Exercise: 


Problem: What is personalized medicine? 
Solution: 
Personalized medicine is the use of an individual's genomic sequence 


to predict the risk for specific diseases. When a disease does occur, it 
can be used to develop a personalized treatment plan. 


Glossary 


biomarker 
individual protein that is uniquely produced in a diseased state 


false negative 
incorrect test result that should have been positive 


metabolome 
complete set of metabolites which are related to the genetic makeup of 
an organism 


metabolomics 
study of small molecule metabolites found in an organism 


protein signature 
set of uniquely expressed proteins in the diseased state 


proteome 
entire set of proteins produced by a cell type 


proteomics 
study of the function of proteomes 


systems biology 
study of whole biological systems (genomes and proteomes) based on 
interactions within the system 


Applying Genomics 
By the end of this section, you will be able to: 


• Explain pharmacogenomics
• Define polygenic


The introduction of DNA sequencing and whole genome sequencing 
projects, particularly the Human Genome Project, has expanded the
applicability of DNA sequence information. Genomics is now being used in 
a wide variety of fields, such as metagenomics, pharmacogenomics, and 
mitochondrial genomics. The most commonly known application of 
genomics is to understand and find cures for diseases. 


Predicting Disease Risk at the Individual Level 


Predicting the risk of disease involves screening currently healthy 
individuals by genome analysis at the individual level. Intervention with 
lifestyle changes and drugs can be recommended before disease onset. 
However, this approach is most applicable when the problem resides within 
a single gene defect. Such defects only account for approximately 5 percent 
of diseases in developed countries. Most of the common diseases, such as
heart disease, are multi-factored or polygenic, meaning that the phenotype
involves two or more genes, and they also involve environmental factors
such as diet. In April 2010, scientists at Stanford
University published the genome analysis of a healthy individual (Stephen 
Quake, a scientist at Stanford University, who had his genome sequenced); 
the analysis predicted his propensity to acquire various diseases. A risk 
assessment was performed to analyze Quake’s percentage of risk for 55 
different medical conditions. A rare genetic mutation was found, which 
showed him to be at risk for sudden heart attack. He was also predicted to 
have a 23 percent risk of developing prostate cancer and a 1.4 percent risk 
of developing Alzheimer’s. The scientists used databases and several 
publications to analyze the genomic data. Even though genomic sequencing 
is becoming more affordable and analytical tools are becoming more 
reliable, ethical issues surrounding genomic analysis at a population level 
remain to be addressed. 


Note: 
Art Connection 
PCA3 is a gene that is expressed in
prostate epithelial cells and 
overexpressed in cancerous cells. 
A high concentration of PCA3 in 
urine is indicative of prostate 
cancer. The PCA3 test is 
considered to be a better indicator
of cancer than the better-known
PSA test, which measures the level
of PSA (prostate-specific antigen) 
in the blood. 


In 2011, the United States Preventive Services Task Force recommended
against using the PSA test to screen healthy men for prostate cancer. Their 
recommendation is based on evidence that screening does not reduce the 
risk of death from prostate cancer. Prostate cancer often develops very 
slowly and does not cause problems, while the cancer treatment can have 
severe side effects. The PCA3 test is considered to be more accurate, but 
screening may still result in men who would not have been harmed by the 
cancer itself suffering side effects from treatment. What do you think? 
Should all healthy men be screened for prostate cancer using the PCA3 or 


PSA test? Should people in general be screened to find out if they have a 
genetic risk for cancer or other diseases? 


Pharmacogenomics and Toxicogenomics 


Pharmacogenomics, also called toxicogenomics, involves evaluating the 
effectiveness and safety of drugs on the basis of information from an 
individual's genomic sequence. Genomic responses to drugs can be studied 
using experimental animals (such as laboratory rats or mice) or live cells in 
the laboratory before embarking on studies with humans. Studying changes 
in gene expression could provide information about the transcription profile 
in the presence of the drug, which can be used as an early indicator of the 
potential for toxic effects. For example, genes involved in cellular growth 
and controlled cell death, when disturbed, could lead to the growth of 
cancerous cells. Genome-wide studies can also help to find new genes 
involved in drug toxicity. Personal genome sequence information can be 
used to prescribe medications that will be most effective and least toxic on 
the basis of the individual patient’s genotype. The gene signatures may not 
be completely accurate, but can be tested further before pathologic 
symptoms arise. 


Microbial Genomics: Metagenomics 


Traditionally, microbiology has been taught with the view that 
microorganisms are best studied under pure culture conditions, which 
involves isolating a single type of cell and culturing it in the laboratory. 
Because microorganisms can go through several generations in a matter of 
hours, their gene expression profiles adapt to the new laboratory 
environment very quickly. In addition, the vast majority of bacterial species 
resist being cultured in isolation. Most microorganisms do not live as 
isolated entities, but in microbial communities known as biofilms. For all of 
these reasons, pure culture is not always the best way to study 
microorganisms. Metagenomics is the study of the collective genomes of
multiple species that grow and interact in an environmental niche. 


Metagenomics can be used to identify new species more rapidly and to 
analyze the effect of pollutants on the environment ([link]). 


All the genomic DNA from a 
particular environment is 
cut into fragments and 
ligated into a cloning vector. 


Each color 
represents 
DNA from a 
different 
species. 


The fragments are
sequenced, and regions
of overlap are used to
determine the genomic
sequences.


Metagenomics involves isolating 
DNA from multiple species within 
an environmental niche. 
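

The idea of using regions of overlap to rebuild a longer sequence can be
sketched as follows. This is a toy example under simplifying assumptions:
the two reads are made up, only exact overlaps are considered, and the
function names and the `min_overlap` parameter are hypothetical. Real
assemblers handle millions of reads and sequencing errors.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left, right, min_overlap=3):
    """Join two reads at their overlap, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


# Made-up reads for illustration only.
read_1 = "ATGGCGTACG"
read_2 = "TACGTTAGCC"
print(merge_reads(read_1, read_2))
```

Here the last four bases of the first read match the first four bases of
the second, so the two reads are joined into one longer sequence.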


Microbial Genomics: Creation of New Biofuels 


Knowledge of the genomics of microorganisms is being used to find better 
ways to harness biofuels from algae and cyanobacteria. The primary 
sources of fuel today are coal, oil, wood, and other plant products, such as 
ethanol. Although plants are renewable resources, there is still a need to 
find more alternative renewable sources of energy to meet our population’s 
energy demands. The microbial world is one of the largest resources for 
genes that encode new enzymes and produce new organic compounds, and 
it remains largely untapped. Microorganisms are used to create products, 
such as enzymes that are used in research, antibiotics, and other
antimicrobial mechanisms. Microbial genomics is helping to develop diagnostic


tools, improved vaccines, new disease treatments, and advanced 
environmental cleanup techniques. 


Mitochondrial Genomics 


Mitochondria are intracellular organelles that contain their own DNA. 
Mitochondrial DNA mutates at a rapid rate and is often used to study 
evolutionary relationships. Another feature that makes studying the 
mitochondrial genome interesting is that the mitochondrial DNA in most 
multicellular organisms is passed on from the mother during the process of 
fertilization. For this reason, mitochondrial genomics is often used to trace 
genealogy. 


Information and clues obtained from DNA samples found at crime scenes 
have been used as evidence in court cases, and genetic markers have been 
used in forensic analysis. Genomic analysis has also become useful in this 
field. In 2001, the first use of genomics in forensics was published. It was a 
collaborative attempt between academic research institutions and the FBI to 
solve the mysterious cases of anthrax communicated via the US Postal 
Service. Using microbial genomics, researchers determined that a specific 
strain of anthrax was used in all the mailings. 


Genomics in Agriculture 


Genomics can reduce the trials and failures involved in scientific research 
to a certain extent, which could improve the quality and quantity of crop 
yields in agriculture. Linking traits to genes or gene signatures helps to 
improve crop breeding to generate hybrids with the most desirable qualities. 
Scientists use genomic data to identify desirable traits, and then transfer 
those traits to a different organism. Scientists are discovering how genomics 
can improve the quality and quantity of agricultural production. For 
example, scientists could use desirable traits to create a useful product or 
enhance an existing product, such as making a drought-sensitive crop more 
tolerant of the dry season. 


Section Summary 


Imagination is the only barrier to the applicability of genomics. Genomics 
is being applied to most fields of biology; it is being used for personalized 
medicine, prediction of disease risks at an individual level, the study of 
drug interactions before the conduct of clinical trials, and the study of 
microorganisms in the environment as opposed to the laboratory. It is also 
being applied to developments such as the generation of new biofuels, 
genealogical assessment using mitochondria, advances in forensic science, 
and improvements in agriculture. 


Art Connections 


Exercise: 


Problem: 


[link] In 2011, the United States Preventive Services Task Force
recommended against using the PSA test to screen healthy men for 
prostate cancer. Their recommendation is based on evidence that 
screening does not reduce the risk of death from prostate cancer. 
Prostate cancer often develops very slowly and does not cause 
problems, while the cancer treatment can have severe side effects. The 
PCA3 test is considered to be more accurate, but screening may still
result in men who would not have been harmed by the cancer itself 
suffering side effects from treatment. What do you think? Should all 
healthy men be screened for prostate cancer using the PCA3 or PSA 
test? Should people in general be screened to find out if they have a 
genetic risk for cancer or other diseases? 


Solution: 


[link] There are no right or wrong answers to these questions. While it 
is true that prostate cancer treatment itself can be harmful, many men 
would rather be aware that they have cancer so they can monitor the 
disease and begin treatment if it progresses. And while genetic 
screening may be useful, it is expensive and may cause needless worry. 
People with certain risk factors may never develop the disease, and 
preventative treatments may do more harm than good. 


Review Questions 


Exercise: 


Problem: Genomics can be used in agriculture to:


a. generate new hybrid strains 
b. improve disease resistance 
c. improve yield 

d. all of the above 


Solution: 
D 
Exercise: 
Problem: Genomics can be used on a personal level to:
a. decrease transplant rejection 
b. Predict genetic diseases that a person may have inherited 
c. Determine the risks of genetic diseases for an individual’s children
d. All the above 


Solution: 


A 


Free Response 


Exercise: 


Problem: 


Explain why metagenomics is probably the most revolutionary 
application of genomics. 


Solution: 


Metagenomics is revolutionary because it replaced the practice of 
using pure cultures. Pure cultures were used to study individual species 
in the laboratory, but did not accurately represent what happens in the 
environment. Metagenomics studies the genomes of bacterial 
populations in their environmental niche. 


Exercise: 
Problem: 


How can genomics be used to predict disease risk and treatment 
options? 


Solution: 


Genomics can provide the unique DNA sequence of an individual, 
which can be used for personalized medicine and treatment options. 


Glossary 


metagenomics 
study of the collective genomes of multiple species that grow and 
interact in an environmental niche 


pharmacogenomics 
study of drug interactions with the genome or proteome; also called 
toxicogenomics 


polygenic 
phenotypic characteristic caused by two or more genes 


pure culture 
growth of a single type of cell in the laboratory 


Biotechnology 
By the end of this section, you will be able to: 


• Describe gel electrophoresis
• Explain molecular and reproductive cloning
• Describe uses of biotechnology in medicine and agriculture


Biotechnology is the use of biological agents for technological 
advancement. Biotechnology was used for breeding livestock and crops 
long before the scientific basis of these techniques was understood. Since 
the discovery of the structure of DNA in 1953, the field of biotechnology 
has grown rapidly through both academic research and private companies. 
The primary applications of this technology are in medicine (production of 
vaccines and antibiotics) and agriculture (genetic modification of crops, 
such as to increase yields). Biotechnology also has many industrial 
applications, such as fermentation, the treatment of oil spills, and the 
production of biofuels ([link]).


Antibiotics are chemicals produced by fungi, 
bacteria, and other organisms that have 
antimicrobial properties. The first antibiotic 
discovered was penicillin. Antibiotics are now 
commercially produced and tested for their 
potential to inhibit bacterial growth. (credit 
"advertisement": modification of work by NIH; 
credit "test plate": modification of work by Don 
Stalons/CDC; scale-bar data from Matt Russell) 


Basic Techniques to Manipulate Genetic Material (DNA and 
RNA) 


To understand the basic techniques used to work with nucleic acids, 
remember that nucleic acids are macromolecules made of nucleotides (a 
sugar, a phosphate, and a nitrogenous base) linked by phosphodiester 
bonds. The phosphate groups on these molecules each have a net negative 
charge. An entire set of DNA molecules in the nucleus is called the 
genome. DNA has two complementary strands linked by hydrogen bonds 
between the paired bases. The two strands can be separated by exposure to 
high temperatures (DNA denaturation) and can be reannealed by cooling. 
The DNA can be replicated by the DNA polymerase enzyme. Unlike DNA, 
which is located in the nucleus of eukaryotic cells, RNA molecules leave 
the nucleus. The most common type of RNA that is analyzed is the 
messenger RNA (mRNA) because it represents the protein-coding genes 
that are actively expressed. However, RNA molecules present some other 
challenges to analysis, as they are often less stable than DNA. 


DNA and RNA Extraction 


To study or manipulate nucleic acids, the DNA or RNA must first be 
isolated or extracted from the cells. Various techniques are used to extract 
different types of DNA ([link]). Most nucleic acid extraction techniques
involve steps to break open the cell and use enzymatic reactions to destroy
the macromolecules that are not desired, so that they can be separated from
the DNA sample. Cells are broken using a
lysis buffer (a solution which is mostly a detergent); lysis means “to split.”
The detergent breaks apart lipid molecules in the cell membranes and
nuclear membranes. Unwanted macromolecules are degraded using enzymes such as
proteases that break down proteins, and ribonucleases (RNAses) that
break down RNA. The DNA is then precipitated using alcohol. Human
genomic DNA is usually visible as a gelatinous, white mass. The DNA
samples can be stored frozen at −80°C for several years.


DNA Extraction 
1. Cells are lysed using a detergent that disrupts the plasma membrane.
2. Cell contents are treated with protease to destroy protein, and RNAse to
destroy RNA.
3. Cell debris is pelleted in a centrifuge. The supernatant (liquid)
containing the DNA is transferred to a clean tube.
4. The DNA is precipitated with ethanol. It forms viscous strands that can
be spooled on a glass rod.


This diagram shows the basic method used for 
extraction of DNA. 


RNA analysis is performed to study gene expression patterns in cells. RNA 
is naturally very unstable because RNAses are commonly present in nature 
and very difficult to inactivate. Similar to DNA, RNA extraction involves 
the use of various buffers and enzymes to inactivate macromolecules and 
preserve the RNA. 


Gel Electrophoresis 


Because nucleic acids are negatively charged ions at neutral or basic pH in 
an aqueous environment, they can be mobilized by an electric field. Gel 
electrophoresis is a technique used to separate molecules on the basis of 
size, using this charge. The nucleic acids can be separated as whole 
chromosomes or fragments. The nucleic acids are loaded into a slot near the 
negative electrode of a semisolid, porous gel matrix and pulled toward the 
positive electrode at the opposite end of the gel. Smaller molecules move 
through the pores in the gel faster than larger molecules; this difference in 
the rate of migration separates the fragments on the basis of size. There are 


molecular weight standard samples that can be run alongside the molecules 
to provide a size comparison. Nucleic acids in a gel matrix can be observed 
using various fluorescent or colored dyes. Distinct nucleic acid fragments 
appear as bands at specific distances from the top of the gel (the negative 
electrode end) on the basis of their size ([link]). A mixture of genomic
DNA fragments of varying sizes appears as a long smear, whereas uncut
genomic DNA is usually too large to run through the gel and forms a single 
large band at the top of the gel. 
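

The separation rule (smaller fragments travel farther) can be expressed as
a simple ordering. The sketch below is illustrative only: the function name
and the fragment sizes are hypothetical, and it gives the order of bands
only, with no attempt to predict actual migration distances.

```python
def band_order(fragment_sizes_bp):
    """Order bands from the top of the gel (wells, negative electrode)
    to the bottom (positive electrode): largest fragments first."""
    return sorted(fragment_sizes_bp, reverse=True)


# Hypothetical fragment lengths in base pairs.
sample = [500, 3000, 150, 1200]
print(band_order(sample))  # [3000, 1200, 500, 150]
```

The 150-base-pair fragment ends up closest to the positive electrode
because it moves through the pores of the gel fastest.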


Shown are DNA fragments 
from seven samples run on a 
gel, stained with a fluorescent 

dye, and viewed under UV 

light. (credit: James Jacob, 
Tompkins Cortland Community 
College) 


Amplification of Nucleic Acid Fragments by Polymerase Chain 
Reaction 


Although genomic DNA is visible to the naked eye when it is extracted in 
bulk, DNA analysis often requires focusing on one or more specific regions 
of the genome. Polymerase chain reaction (PCR) is a technique used to 
amplify specific regions of DNA for further analysis ([link]). PCR is used
for many purposes in laboratories, such as the cloning of gene fragments to
analyze genetic diseases, identification of contaminant foreign DNA in a
sample, and the amplification of DNA for sequencing. More practical 
applications include the determination of paternity and detection of genetic 
diseases. 
The PCR cycle consists of three steps (denaturation, annealing, and DNA
synthesis) that occur at high, low, and intermediate temperatures,
respectively.


Polymerase chain reaction, or PCR, is used to amplify a
specific sequence of DNA. Primers—short pieces of DNA 
complementary to each end of the target sequence—are 
combined with genomic DNA, Taq polymerase, and 
deoxynucleotides. Taq polymerase is a DNA polymerase 
isolated from the thermostable bacterium Thermus 
aquaticus that is able to withstand the high temperatures 
used in PCR. Thermus aquaticus grows in the Lower 
Geyser Basin of Yellowstone National Park. Reverse 
transcriptase PCR (RT-PCR) is similar to PCR, but cDNA
is made from an RNA template before PCR begins. 
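

How a pair of primers defines the amplified region can be sketched in
Python. This is a simplified model, not a laboratory protocol: the template
and primer sequences are made up, primers must match exactly, and the
function names are hypothetical. The copy count assumes an idealized
reaction in which every molecule is doubled in every cycle.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template, forward_primer, reverse_primer):
    """Return the stretch of `template` bounded by the two primers, or None.

    `template` is the top strand, 5'->3'. The forward primer matches the
    top strand; the reverse primer matches the bottom strand.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


def ideal_copies(starting_molecules, cycles):
    """Copies after `cycles` rounds if every round doubles the DNA."""
    return starting_molecules * 2 ** cycles


# Made-up sequences for illustration only.
template = "GGATCCATGCTAGCTAGGCTTAACGGTACCTT"
product = amplicon(template, "ATGCTAGC", "GTACCGTT")
print(product)
print(ideal_copies(1, 30))
```

Only the region between (and including) the two primer sites is copied,
which is why PCR can pick one short target out of an entire genome.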


DNA fragments can also be amplified from an RNA template in a process 
called reverse transcriptase PCR (RT-PCR). The first step is to recreate 
the original DNA template strand (called cDNA) by applying DNA 
nucleotides to the mRNA. This process is called reverse transcription. This 
requires the presence of an enzyme called reverse transcriptase. After the 
cDNA is made, regular PCR can be used to amplify it. 
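

Reverse transcription can be pictured as building the DNA strand that is
complementary to the mRNA. The sketch below shows only the base-pairing
rule; the function name and the short mRNA sequence are hypothetical.

```python
# Each RNA base pairs with a DNA base: A-T, C-G, G-C, U-A.
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")


def reverse_transcribe(mrna):
    """Return first-strand cDNA (written 5'->3') complementary to an mRNA
    that is also written 5'->3'."""
    return mrna.translate(RNA_TO_DNA)[::-1]


# Made-up mRNA fragment for illustration only.
print(reverse_transcribe("AUGGCCUAA"))  # TTAGGCCAT
```

The resulting cDNA strand can then serve as a template for ordinary PCR.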


Note: 
Link to Learning 
Deepen your understanding of the polymerase chain reaction by clicking
through this interactive exercise. 


Hybridization, Southern Blotting, and Northern Blotting 


Nucleic acid samples, such as fragmented genomic DNA and RNA extracts, 
can be probed for the presence of certain sequences. Short DNA fragments 
called probes are designed and labeled with radioactive or fluorescent dyes 
to aid detection. Gel electrophoresis separates the nucleic acid fragments 
according to their size. The fragments in the gel are then transferred onto a 
nylon membrane in a procedure called blotting ([link]). The nucleic acid
fragments that are bound to the surface of the membrane can then be probed 
with specific radioactively or fluorescently labeled probe sequences. When 
DNA is transferred to a nylon membrane, the technique is called Southern 
blotting, and when RNA is transferred to a nylon membrane, it is called 
northern blotting. Southern blots are used to detect the presence of certain 
DNA sequences in a given genome, and northern blots are used to detect 
gene expression. 
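

Probing works because the probe base-pairs only with fragments that contain
its complementary sequence. The following sketch models that rule with
exact string matching; the fragment names, sequences, and function names
are hypothetical, and real hybridization tolerates some mismatches
depending on conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def bands_detected(probe, fragments):
    """Return the names of fragments that the probe can hybridize to.

    `fragments` maps a band name to the sequence of one strand (5'->3').
    The probe binds where that strand contains the probe's reverse
    complement.
    """
    target = reverse_complement(probe)
    return [name for name, seq in fragments.items() if target in seq]


# Made-up membrane contents for illustration only.
membrane = {
    "band_1": "GGCATTACGGA",
    "band_2": "TTTTCCCCAAAA",
}
print(bands_detected("CGTAATG", membrane))  # ['band_1']
```

Only the band carrying the complementary sequence would light up when the
labeled probe is visualized.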


Southern Blotting 
Electrophoresis is used to
separate DNA fragments by
size. There can be so many
fragments that they appear
as a smear on the gel.


The DNA is transferred
from the agarose gel to
a nylon membrane.


The membrane is bathed
in a solution containing a
probe, a short piece of
DNA complementary to
the sequence of interest.


The probe is labeled or
tagged with a fluorescent
dye so that the location
of DNA fragments to
which it hybridizes can
be visualized.


Southern blotting is used to find a particular 
sequence in a sample of DNA. DNA fragments are 
separated on a gel, transferred to a nylon 
membrane, and incubated with a DNA probe 
complementary to the sequence of interest. 
Northern blotting is similar to Southern blotting, 
but RNA is run on the gel instead of DNA. In 
western blotting, proteins are run on a gel and 
detected using antibodies. 


Molecular Cloning 


In general, the word “cloning” means the creation of a perfect replica; 
however, in biology, the re-creation of a whole organism is referred to as 
“reproductive cloning.” Long before attempts were made to clone an entire 
organism, researchers learned how to reproduce desired regions or 
fragments of the genome, a process that is referred to as molecular cloning. 


Cloning small fragments of the genome allows for the manipulation and 
study of specific genes (and their protein products), or noncoding regions in 
isolation. A plasmid (also called a vector) is a small circular DNA molecule 
that replicates independently of the chromosomal DNA. In cloning, the 


plasmid molecules can be used to provide a "folder" in which to insert a 
desired DNA fragment. Plasmids are usually introduced into a bacterial host 
for proliferation. In the bacterial context, the fragment of DNA from the 
human genome (or the genome of another organism that is being studied) is 
referred to as foreign DNA, or a transgene, to differentiate it from the DNA 
of the bacterium, which is called the host DNA. 


Plasmids occur naturally in bacterial populations (such as Escherichia coli) 
and have genes that can contribute favorable traits to the organism, such as 
antibiotic resistance (the ability to be unaffected by antibiotics). Plasmids 
have been repurposed and engineered as vectors for molecular cloning and 
the large-scale production of important reagents, such as insulin and human 
growth hormone. An important feature of plasmid vectors is the ease with 
which a foreign DNA fragment can be introduced via the multiple cloning 
site (MCS). The MCS is a short DNA sequence containing multiple sites 
that can be cut with different commonly available restriction endonucleases. 
Restriction endonucleases recognize specific DNA sequences and cut 
them in a predictable manner; they are naturally produced by bacteria as a 
defense mechanism against foreign DNA. Many restriction endonucleases 
make staggered cuts in the two strands of DNA, such that the cut ends have 
a 2- or 4-base single-stranded overhang. Because these overhangs are 
capable of annealing with complementary overhangs, these are called 
“sticky ends.” Addition of an enzyme called DNA ligase permanently joins 
the DNA fragments via phosphodiester bonds. In this way, any DNA 
fragment generated by restriction endonuclease cleavage can be spliced 
between the two ends of a plasmid DNA that has been cut with the same 
restriction endonuclease ([link]). 
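

Cutting and splicing with a restriction endonuclease can be sketched with
plain strings. The example uses EcoRI, which recognizes GAATTC and cuts
after the G, leaving a four-base AATT overhang. Everything else is a
simplifying assumption: the sequences are made up, only the top strand is
modeled, the circular plasmid is written as a linear string, and the
function names are hypothetical.

```python
ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # the top strand is cut between G and AATTC


def digest(seq, site=ECORI_SITE, offset=ECORI_CUT_OFFSET):
    """Cut `seq` at every occurrence of `site`; return the fragments."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def ligate(left_arm, insert, right_arm):
    """Join fragments whose sticky ends have annealed (the job of DNA ligase)."""
    return left_arm + insert + right_arm


# Made-up sequences for illustration only.
plasmid = "CCCGAATTCGGG"                # one EcoRI site
foreign = "TTGAATTCATGCATGAATTCAA"      # insert flanked by two EcoRI sites

left_arm, right_arm = digest(plasmid)
_, insert, _ = digest(foreign)
recombinant = ligate(left_arm, insert, right_arm)
print(recombinant)
```

Because the plasmid and the foreign DNA were cut with the same enzyme, the
overhangs match and the EcoRI site is restored on both sides of the insert.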


Recombinant DNA Molecules 


Plasmids with foreign DNA inserted into them are called recombinant 
DNA molecules because they are created artificially and do not occur in 
nature. They are also called chimeric molecules because the origin of 
different parts of the molecules can be traced back to different species of 
biological organisms or even to chemical synthesis. Proteins that are 
expressed from recombinant DNA molecules are called recombinant 


proteins. Not all recombinant plasmids are capable of expressing genes. 
The recombinant DNA may need to be moved into a different vector (or 
host) that is better designed for gene expression. Plasmids may also be 
engineered to express proteins only when stimulated by certain 
environmental factors, so that scientists can control the expression of the 
recombinant proteins. 


Note: 
Art Connection 


Molecular Cloning 
The foreign DNA and the plasmid, which carries the lacZ gene, are cut with
the same restriction enzyme, which recognizes a particular sequence of DNA
called a restriction site.


The ligated cloning vector is transformed into a bacterial host strain that
is ampicillin sensitive and is missing the lacZ gene from its genome.
Bacteria may take up plasmid with or without the insert, or may not take up
plasmid at all.


Bacteria are grown on media containing ampicillin and 
X-gal, a chemical that is metabolized by the same 
pathway as lactose. The ampicillin kills bacteria without 
plasmid. Plasmids lacking the foreign insert have an 
intact lacZ gene and are able to metabolize X-gal, 
releasing a dye that turns the colony blue. Plasmids with 
an insert have a disrupted lacZ gene and produce white
colonies. 
Bacterial genome is
missing the lacZ gene. 
White colonies 
have plasmids 
with the foreign 
insert. 


Blue colonies 
have plasmids 
without insert. 


This diagram shows the steps involved in molecular cloning. 
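

The selection scheme in the diagram can be summarized as a small decision
rule. The function name and return labels below are hypothetical; the rule
itself restates the diagram for bacteria plated on ampicillin and X-gal.

```python
def colony_phenotype(has_plasmid, has_insert):
    """Predict what a bacterium does on ampicillin + X-gal medium."""
    if not has_plasmid:
        # No plasmid: ampicillin kills the cell.
        return "no growth"
    if has_insert:
        # The insert disrupts lacZ, so X-gal is not metabolized.
        return "white"
    # Intact lacZ metabolizes X-gal and releases the blue dye.
    return "blue"


for plasmid, insert in [(False, False), (True, False), (True, True)]:
    print(plasmid, insert, colony_phenotype(plasmid, insert))
```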


You are working in a molecular biology lab and, unbeknownst to you, your 
lab partner left the foreign genomic DNA that you are planning to clone on 
the lab bench overnight instead of storing it in the freezer. As a result, it 
was degraded by nucleases, but still used in the experiment. The plasmid, 
on the other hand, is fine. What results would you expect from your 
molecular cloning experiment? 


a. There will be no colonies on the bacterial plate. 
b. There will be blue colonies only. 

c. There will be blue and white colonies. 

d. There will be white colonies only.


Note: 
Link to Learning 
View an animation of recombination in cloning from the DNA Learning
Center. 


Cellular Cloning 


Unicellular organisms, such as bacteria and yeast, naturally produce clones
of themselves when they replicate asexually; this is known as cellular
cloning. Bacteria divide by binary fission, whereas in unicellular
eukaryotes such as yeast the nuclear DNA is replicated and then partitioned
by mitosis, which gives each daughter cell an exact replica of the genetic
material.


Reproductive Cloning 


Reproductive cloning is a method used to make a clone or an identical 
copy of an entire multicellular organism. Most multicellular organisms 
undergo reproduction by sexual means, which involves genetic 
hybridization of two individuals (parents), making it impossible for 
generation of an identical copy or a clone of either parent. Recent advances 
in biotechnology have made it possible to artificially induce asexual 
reproduction of mammals in the laboratory. 


Parthenogenesis, or “virgin birth,” occurs when an embryo grows and 
develops without the fertilization of the egg occurring; this is a form of 
asexual reproduction. An example of parthenogenesis occurs in species in 
which the female lays an egg and if the egg is fertilized, it is a diploid egg 
and the individual develops into a female; if the egg is not fertilized, it 
remains a haploid egg and develops into a male. The unfertilized egg is 
called a parthenogenic, or virgin, egg. Some insects and reptiles lay 
parthenogenic eggs that can develop into adults. 


Sexual reproduction requires two cells; when the haploid egg and sperm 
cells fuse, a diploid zygote results. The zygote nucleus contains the genetic 
information to produce a new individual. However, early embryonic 
development requires the cytoplasmic material contained in the egg cell. 
This idea forms the basis for reproductive cloning. Therefore, if the haploid 
nucleus of an egg cell is replaced with a diploid nucleus from the cell of any 
individual of the same species (called a donor), it will become a zygote that 
is genetically identical to the donor. Somatic cell nuclear transfer is the 
technique of transferring a diploid nucleus into an enucleated egg. It can be 
used for either therapeutic cloning or reproductive cloning. 


The first mammal cloned from an adult cell was Dolly, a sheep who was born in 1996. The
success rate of reproductive cloning at the time was very low. Dolly lived 
for seven years and died of respiratory complications ([link]). There is 
speculation that because the cell DNA belongs to an older individual, the 
age of the DNA may affect the life expectancy of a cloned individual. Since 
Dolly, several animals such as horses, bulls, and goats have been 
successfully cloned, although these individuals often exhibit facial, limb, 
and cardiac abnormalities. There have been attempts at producing cloned 
human embryos as sources of embryonic stem cells, sometimes referred to
as cloning for therapeutic purposes. Therapeutic cloning produces stem
cells to attempt to remedy detrimental diseases or defects (unlike 
reproductive cloning, which aims to reproduce an organism). Still, 
therapeutic cloning efforts have met with resistance because of bioethical 
considerations. 


Note: 
Art Connection 
Dolly the sheep was the first mammal to be cloned. To create Dolly, the
nucleus was removed from a donor egg cell. The nucleus from a second
sheep was then introduced into the cell, which was allowed to divide to
the blastocyst stage before being implanted in a surrogate mother.
(credit: modification of work by "Squidonius"/Wikimedia Commons)


Do you think Dolly was a Finn-Dorset or a Scottish Blackface sheep? 


Genetic Engineering 


Genetic engineering is the alteration of an organism’s genotype using 
recombinant DNA technology to modify an organism’s DNA to achieve 
desirable traits. The addition of foreign DNA in the form of recombinant 
DNA vectors generated by molecular cloning is the most common method 
of genetic engineering. The organism that receives the recombinant DNA is 
called a genetically modified organism (GMO). If the foreign DNA that is 
introduced comes from a different species, the host organism is called 
transgenic. Bacteria, plants, and animals have been genetically modified 
since the early 1970s for academic, medical, agricultural, and industrial 
purposes. In the US, GMOs such as Roundup-ready soybeans and borer- 
resistant corn are part of many common processed foods. 


Gene Targeting 


Although classical methods of studying the function of genes began with a 
given phenotype and determined the genetic basis of that phenotype, 
modern techniques allow researchers to start at the DNA sequence level and 
ask: "What does this gene or DNA element do?" This technique, called 
reverse genetics, has resulted in reversing the classic genetic methodology. 
This method would be similar to damaging a body part to determine its 
function. An insect that loses a wing cannot fly, which means that the 
function of the wing is flight. The classical genetic method would compare 
insects that cannot fly with insects that can fly, and observe that the non- 
flying insects have lost wings. Similarly, mutating or deleting genes 
provides researchers with clues about gene function. The methods used to
disable gene function are collectively called gene targeting. Gene targeting
is the use of recombinant DNA vectors to alter the expression of a particular 
gene, either by introducing mutations in a gene, or by eliminating the 
expression of a certain gene by deleting a part or all of the gene sequence 
from the genome of an organism. 


Biotechnology in Medicine and Agriculture 


It is easy to see how biotechnology can be used for medicinal purposes. 
Knowledge of the genetic makeup of our species, the genetic basis of 
heritable diseases, and the invention of technology to manipulate and fix 
mutant genes provides methods to treat the disease. Biotechnology in 
agriculture can enhance resistance to disease, pest, and environmental 
stress, and improve both crop yield and quality. 


Genetic Diagnosis and Gene Therapy 


The process of testing for suspected genetic defects before administering 
treatment is called genetic diagnosis by genetic testing. Depending on the 
inheritance patterns of a disease-causing gene, family members are advised 
to undergo genetic testing. For example, women diagnosed with breast 
cancer are usually advised to have a biopsy so that the medical team can 
determine the genetic basis of cancer development. Treatment plans are 
based on the findings of genetic tests that determine the type of cancer. If 
the cancer is caused by inherited gene mutations, other female relatives are 
also advised to undergo genetic testing and periodic screening for breast 
cancer. Genetic testing is also offered for fetuses (or embryos with in vitro 
fertilization) to determine the presence or absence of disease-causing genes 
in families with specific debilitating diseases. 


Gene therapy is a genetic engineering technique used to cure disease. In its 
simplest form, it involves the introduction of a good gene at a random 
location in the genome to aid the cure of a disease that is caused by a 
mutated gene. The good gene is usually introduced into diseased cells as 
part of a vector transmitted by a virus that can infect the host cell and
deliver the foreign DNA ([link]). More advanced forms of gene therapy try
to correct the mutation at the original site in the genome, such as is the case 
with treatment of severe combined immunodeficiency (SCID). 
Gene therapy using an adenovirus vector can be used to
cure certain genetic diseases in which a person has a 
defective gene. (credit: NIH) 


Production of Vaccines, Antibiotics, and Hormones 


Traditional vaccination strategies use weakened or inactive forms of 
microorganisms to mount the initial immune response. Modern techniques 
use the genes of microorganisms cloned into vectors to mass produce the 
desired antigen. The antigen is then introduced into the body to stimulate 
the primary immune response and trigger immune memory. Genes cloned
from the influenza virus have been used to combat the constantly changing
strains of this virus. 


Antibiotics are a biotechnological product. They are naturally produced by 
microorganisms, such as fungi, to attain an advantage over bacterial 
populations. Antibiotics are produced on a large scale by cultivating and 
manipulating fungal cells. 


Recombinant DNA technology was used to produce large-scale quantities 
of human insulin in E. coli as early as 1978. Previously, it was only possible 
to treat diabetes with pig insulin, which caused allergic reactions in humans 
because of differences in the gene product. In addition, human growth 
hormone (HGH) is used to treat growth disorders in children. The HGH 
gene was cloned from a cDNA library and inserted into E. coli cells by 
cloning it into a bacterial vector. 


Transgenic Animals 


Although several recombinant proteins used in medicine are successfully 
produced in bacteria, some proteins require a eukaryotic animal host for 
proper processing. For this reason, the desired genes are cloned and 
expressed in animals, such as sheep, goats, chickens, and mice. Animals 
that have been modified to express recombinant DNA are called transgenic 
animals. Several human proteins are expressed in the milk of transgenic 
sheep and goats, and some are expressed in the eggs of chickens. Mice have 
been used extensively for expressing and studying the effects of 
recombinant genes and mutations. 


Transgenic Plants 


Manipulating the DNA of plants (i.e., creating GMOs) has helped to create 
desirable traits, such as disease resistance, herbicide and pesticide 
resistance, better nutritional value, and better shelf-life ([link]). Plants are
the most important source of food for the human population. Farmers 
developed ways to select for plant varieties with desirable traits long before 
modern-day biotechnology practices were established. 


Corn, a major agricultural 
crop used to create 
products for a variety of 
industries, is often 
modified through plant 
biotechnology. (credit: 
Keith Weller, USDA) 


Plants that have received recombinant DNA from other species are called 
transgenic plants. Because they are not natural, transgenic plants and other 
GMOs are closely monitored by government agencies to ensure that they 
are fit for human consumption and do not endanger other plant and animal 
life. Because foreign genes can spread to other species in the environment, 
extensive testing is required to ensure ecological stability. Staples like corn, 
potatoes, and tomatoes were the first crop plants to be genetically 
engineered. 


Transformation of Plants Using Agrobacterium tumefaciens 


Gene transfer occurs naturally between species in microbial populations. 
Many viruses that cause human diseases, such as cancer, act by 
incorporating their DNA into the human genome. In plants, tumors caused 
by the bacterium Agrobacterium tumefaciens occur by transfer of DNA 
from the bacterium to the plant. Although the tumors do not kill the plants, 
they make the plants stunted and more susceptible to harsh environmental 
conditions. Many plants, such as walnuts, grapes, nut trees, and beets, are 
affected by A. tumefaciens. The artificial introduction of DNA into plant 
cells is more challenging than in animal cells because of the thick plant cell 
wall. 


Researchers used the natural transfer of DNA from Agrobacterium to a 
plant host to introduce DNA fragments of their choice into plant hosts. In 
nature, the disease-causing A. tumefaciens have a set of plasmids, called the 
Ti plasmids (tumor-inducing plasmids), that contain genes for the 
production of tumors in plants. DNA from the Ti plasmid integrates into the 
infected plant cell’s genome. Researchers manipulate the Ti plasmids to 
remove the tumor-causing genes and insert the desired DNA fragment for 
transfer into the plant genome. The Ti plasmids carry antibiotic resistance 
genes to aid selection and can be propagated in E. coli cells as well. 


The Organic Insecticide Bacillus thuringiensis 


Bacillus thuringiensis (Bt) is a bacterium that produces protein crystals 
during sporulation that are toxic to many insect species that affect plants. Bt 
toxin has to be ingested by insects for the toxin to be activated. Insects that 
have eaten Bt toxin stop feeding on the plants within a few hours. After the 
toxin is activated in the intestines of the insects, death occurs within a 
couple of days. Modern biotechnology has allowed plants to encode their 
own crystal Bt toxin that acts against insects. The crystal toxin genes have 
been cloned from Bt and introduced into plants. Bt toxin has been found to 
be safe for the environment, non-toxic to humans and other mammals, and 
is approved for use by organic farmers as a natural insecticide. 


Flavr Savr Tomato 


The first GM crop to be introduced into the market was the Flavr Savr 
Tomato produced in 1994. Antisense RNA technology was used to slow 
down the process of softening and rotting caused by fungal infections, 
which led to increased shelf life of the GM tomatoes. Additional genetic 
modification improved the flavor of this tomato. The Flavr Savr tomato did 
not successfully stay in the market because of problems maintaining and 
shipping the crop. 


Section Summary 


Nucleic acids can be isolated from cells for the purposes of further analysis 
by breaking open the cells and enzymatically destroying all other major 
macromolecules. Fragmented or whole chromosomes can be separated on 
the basis of size by gel electrophoresis. Short stretches of DNA or RNA can 
be amplified by PCR. Southern and northern blotting can be used to detect 
the presence of specific short sequences in a DNA or RNA sample. The 
term “cloning” may refer to cloning small DNA fragments (molecular 
cloning), cloning cell populations (cellular cloning), or cloning entire 
organisms (reproductive cloning). Genetic testing is performed to identify 
disease-causing genes, and gene therapy is used to cure an inheritable 
disease. 


Transgenic organisms possess DNA from a different species, usually 
generated by molecular cloning techniques. Vaccines, antibiotics, and 
hormones are examples of products obtained by recombinant DNA 
technology. Transgenic plants are usually created to improve characteristics 
of crop plants. 


Art Connections 


Exercise: 


Problem: 


[link] You are working in a molecular biology lab and, unbeknownst to 
you, your lab partner left the foreign genomic DNA that you are 
planning to clone on the lab bench overnight instead of storing it in the 
freezer. As a result, it was degraded by nucleases, but still used in the 
experiment. The plasmid, on the other hand, is fine. What results 
would you expect from your molecular cloning experiment? 


a. There will be no colonies on the bacterial plate. 
b. There will be blue colonies only. 

c. There will be blue and white colonies. 

d. There will be white colonies only.


Solution: 


[link] B. The experiment would result in blue colonies only. 
Exercise: 


Problem: 


[link] Do you think Dolly was a Finn-Dorset or a Scottish Blackface 
sheep? 


Solution: 

[link] Dolly was a Finn-Dorset sheep because even though the original 
cell came from a Scottish blackface sheep and the surrogate mother 
was a Scottish blackface, the DNA came from a Finn-Dorset. 


Review Questions 


Exercise: 


Problem: GMOs are created by


a. generating genomic DNA fragments with restriction 
endonucleases 

b. introducing recombinant DNA into an organism by any means 

c. overexpressing proteins in E. coli. 

d. all of the above 


Solution: 


B 
Exercise: 


Problem: 


Gene therapy can be used to introduce foreign DNA into cells 


a. for molecular cloning 

b. by PCR 

c. of tissues to cure inheritable disease 
d. all of the above 


Solution: 
C 
Exercise: 
Problem: Insulin produced by molecular cloning: 
a. is of pig origin 
b. is a recombinant protein 


c. is made by the human pancreas 
d. is recombinant DNA 


Solution: 


B 
Exercise: 
Problem: Bt toxin is considered to be 
a. a gene for modifying insect DNA 
b. an organic insecticide produced by bacteria 


c. useful for humans to fight against insects 
d. a recombinant protein 


Solution: 


B 


Exercise: 


Problem: The Flavr Savr Tomato:


a. is a variety of vine-ripened tomato in the supermarket 
b. was created to have better flavor and shelf-life 

c. does not undergo soft rot 

d. all of the above 


Solution: 


D 


Free Response 
Exercise: 
Problem: Describe the process of Southern blotting. 


Solution: 


Southern blotting is the transfer of DNA that has been enzymatically 
cut into fragments and run on an agarose gel onto a nylon membrane. 
The DNA fragments that are on the nylon membrane can be denatured 
to make them single-stranded, and then probed with small DNA 
fragments that are radioactively or fluorescently labeled, to detect the 
presence of specific sequences. An example of the use of Southern 
blotting would be in analyzing the presence, absence, or variation of a 
disease gene in genomic DNA from a group of patients. 


Exercise: 
Problem: 


A researcher wants to study cancer cells from a patient with breast 
cancer. Is cloning the cancer cells an option? 


Solution: 
Cellular cloning of the breast cancer cells will establish a cell line,
which can be used for further analysis.
Exercise: 
Problem: 


How would a scientist introduce a gene for herbicide resistance into a 
plant? 


Solution: 


By identifying an herbicide resistance gene and cloning it into a plant 
expression vector system, like the Ti plasmid system from 
Agrobacterium tumefaciens. The scientist would then introduce it into 
the plant cells by transformation, and select cells that have taken up 
and integrated the herbicide-resistance gene into the genome. 


Exercise: 


Problem: 


If you had a chance to get your genome sequenced, what are some 
questions you might be able to have answered about yourself? 


Solution: 


What diseases am I prone to and what precautions should I take? Am I 
a carrier for any disease-causing genes that may be passed on to 
children? 


Glossary 


antibiotic resistance 
ability of an organism to be unaffected by the actions of an antibiotic 


biotechnology 
use of biological agents for technological advancement 


cellular cloning 
production of identical cell populations by binary fission 


clone 
exact replica 


foreign DNA 
DNA that belongs to a different species or DNA that is artificially 
synthesized 


gel electrophoresis 
technique used to separate molecules on the basis of size using electric 
charge 


gene targeting 
method for altering the sequence of a specific gene by introducing the 
modified version on a vector 


gene therapy 
technique used to cure inheritable diseases by replacing mutant genes 
with good genes 


genetic diagnosis 
diagnosis of the potential for disease development by analyzing 
disease-causing genes 


genetic engineering 
alteration of the genetic makeup of an organism 


genetic testing 
process of testing for the presence of disease-causing genes 


genetically modified organism (GMO) 
organism whose genome has been artificially changed 


host DNA 
DNA that is present in the genome of the organism of interest 


lysis buffer 
solution used to break the cell membrane and release cell contents 


molecular cloning 
cloning of DNA fragments 


multiple cloning site (MCS) 
site that can be recognized by multiple restriction endonucleases 


northern blotting 
transfer of RNA from a gel to a nylon membrane 


polymerase chain reaction (PCR) 
technique used to amplify DNA 


probe 
small DNA fragment used to determine if the complementary sequence 
is present in a DNA sample


protease 
enzyme that breaks down proteins 


recombinant DNA 
combination of DNA fragments generated by molecular cloning that 
does not exist in nature; also known as a chimeric molecule 


recombinant protein 
protein product of a gene derived by molecular cloning 


reproductive cloning 
cloning of entire organisms 


restriction endonuclease 
enzyme that can recognize and cleave specific DNA sequences 


reverse genetics 
method of determining the function of a gene by starting with the gene 
itself instead of starting with the gene product 


reverse transcriptase PCR (RT-PCR) 
PCR technique that involves converting RNA to DNA by reverse 
transcriptase 


ribonuclease 
enzyme that breaks down RNA 


Southern blotting 
transfer of DNA from a gel to a nylon membrane 


Ti plasmid 
plasmid system derived from Agrobacterium tumefaciens that has been
used by scientists to introduce foreign DNA into plant cells 


transgenic 
organism that receives DNA from a different species 


Determining Evolutionary Relationships 
By the end of this section, you will be able to: 


• Compare homologous and analogous traits
• Discuss the purpose of cladistics
• Describe maximum parsimony


Scientists must collect accurate information that allows them to make 
evolutionary connections among organisms. Similar to detective work, 
scientists must use evidence to uncover the facts. In the case of phylogeny, 
evolutionary investigations focus on two types of evidence: morphologic 
(form and function) and genetic. 


Two Options for Similarities 


In general, organisms that share similar physical features and genomes tend 
to be more closely related than those that do not. Such features that overlap 
both morphologically (in form) and genetically are referred to as 
homologous structures; they stem from developmental similarities that are 
based on evolution. For example, the bones in the wings of bats and birds 
have homologous structures ([link]).


Homologous Structures 


(a) Bird wing (b) Bat wing 


Bat and bird wings are homologous structures, indicating that bats and
birds share a common evolutionary past. (credit a: modification of work
by Steve Hillebrand, USFWS; credit b: modification of work by U.S. DOI
BLM)


Notice it is not simply a single bone, but rather a grouping of several bones 
arranged in a similar way. The more complex the feature, the more likely 
any kind of overlap is due to a common evolutionary past. Imagine two 
people from different countries both inventing a car with all the same parts 
and in exactly the same arrangement without any previous or shared 
knowledge. That outcome would be highly improbable. However, if two 
people both invented a hammer, it would be reasonable to conclude that 
both could have the original idea without the help of the other. The same 
relationship between complexity and shared evolutionary history is true for 
homologous structures in organisms. 


Misleading Appearances 


Some organisms may be very closely related, even though a minor genetic 
change caused a major morphological difference to make them look quite 
different. Similarly, organisms may be only distantly related yet appear
very much alike. This usually happens because both organisms developed
common adaptations while evolving under similar environmental
conditions. When similar characteristics occur because of environmental
constraints and not due to a close evolutionary relationship, it is called an
analogy or homoplasy. For example, insects use wings to fly like bats and
birds, but the wing structure and embryonic origin are completely different.
These are called analogous structures ([link]).


Similar traits can be either homologous or analogous. Homologous 
structures share a similar embryonic origin; analogous organs have a similar 
function. For example, the bones in the front flipper of a whale are 
homologous to the bones in the human arm. These structures are not 
analogous. The wings of a butterfly and the wings of a bird are analogous 
but not homologous. Some structures are both analogous and homologous: 
the wings of a bird and the wings of a bat are both homologous and
analogous. Scientists must determine which type of similarity a feature
exhibits to decipher the phylogeny of the organisms being studied. 


(a) Bat wing (b) Bird wing 


(c) Insect wing 


The (c) wing of a honeybee is similar in shape to a 
(b) bird wing and (a) bat wing, and it serves the 
same function. However, the honeybee wing is not 
composed of bones and has a distinctly different 
structure and embryonic origin. These wing types 
(insect versus bat and bird) illustrate an analogy— 
similar structures that do not share an evolutionary 
history. (credit a: modification of work by Steve 
Hillebrand, USFWS; credit b: modification of 
work by U.S. DOI BLM; credit c: modification of 
work by Jon Sullivan) 


Note: 
Link to Learning 




This website has several examples to show how appearances can be 
misleading in understanding the phylogenetic relationships of organisms. 


Molecular Comparisons 


With the advancement of DNA technology, the area of molecular 
systematics, which describes the use of information on the molecular level 
including DNA analysis, has blossomed. New computer programs not only
confirm many earlier classifications of organisms, but also uncover
previously made errors. As with physical characteristics, even the DNA sequence can
be tricky to read in some cases. For some situations, two very closely 
related organisms can appear unrelated if a mutation occurred that caused a 
shift in the genetic code. An insertion or deletion mutation would move 
each nucleotide base over one place, causing two similar codes to appear 
unrelated. 
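A short Python sketch can illustrate why a single insertion makes two nearly identical sequences look unrelated when they are compared position by position. The sequence below is hypothetical, and positional_identity is a deliberately naive comparison written for this example, not an alignment method used by real molecular systematics software.

```python
def positional_identity(a: str, b: str) -> float:
    """Fraction of positions that match when two sequences are compared
    base by base, with no attempt to align them."""
    length = min(len(a), len(b))
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / length


original = "ATGGCCTTAGACGGTCAA"                # hypothetical sequence
inserted = original[:3] + "C" + original[3:]   # one extra base after position 3

# Identical sequences match at every position.
print(positional_identity(original, original))

# After the insertion, every downstream base is shifted by one place,
# so most positions no longer match even though only one base was added.
print(positional_identity(original, inserted))

# Removing the inserted base (in effect, aligning the sequences)
# restores a perfect match.
realigned = inserted[:3] + inserted[4:]
print(positional_identity(original, realigned))
```

The contrast between the second and third results is the reason sequence comparisons rely on alignment rather than raw position-by-position matching.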


Sometimes two segments of DNA code in distantly related organisms 
randomly share a high percentage of bases in the same locations, causing 
these organisms to appear closely related when they are not. For both of 
these situations, computer technologies have been developed to help 
identify the actual relationships, and, ultimately, the coupled use of both 
morphologic and molecular information is more effective in determining 
phylogeny. 


Note: 


Evolution Connection 

Why Does Phylogeny Matter? 

Evolutionary biologists could list many reasons why understanding 
phylogeny is important to everyday life in human society. For botanists, 
phylogeny acts as a guide to discovering new plants that can be used to 
benefit people. Think of all the ways humans use plants—food, medicine, 
and clothing are a few examples. If a plant contains a compound that is 
effective in treating cancer, scientists might want to examine all of the 
relatives of that plant for other useful drugs. 

A research team in China identified a segment of DNA thought to be 
common to some medicinal plants in the family Fabaceae (the legume 
family) and worked to identify which species had this segment ([link]).
After testing plant species in this family, the team found a DNA marker (a 
known location on a chromosome that enabled them to identify the 
species) present. Then, using the DNA to uncover phylogenetic 
relationships, the team could identify whether a newly discovered plant 
was in this family and assess its potential medicinal properties. 




Dalbergia sissoo (D. sissoo) is in 
the Fabaceae, or legume family. 
Scientists found that D. sissoo 
shares a DNA marker with species 
within the Fabaceae family that 
have antifungal properties. 
Subsequently, D. sissoo was 
shown to have fungicidal activity, 
supporting the idea that DNA 
markers can be used to screen for 
plants with potential medicinal 
properties. 


Building Phylogenetic Trees 


How do scientists construct phylogenetic trees? After the homologous and 
analogous traits are sorted, scientists often organize the homologous traits 
using a system called cladistics. This system sorts organisms into clades: 
groups of organisms that descended from a single ancestor. For example, in 
[link], all of the organisms in the orange region evolved from a single 
ancestor that had amniotic eggs. Consequently, all of these organisms also 
have amniotic eggs and make a single clade, also called a monophyletic 
group. Clades must include all of the descendants from a branch point. 
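The requirement that a clade include every descendant of a branch point can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the tree is written as nested tuples, its branching order is assumed from the taxa named in the figure, and the helper names leaves and is_clade are invented for this example.

```python
def leaves(tree):
    """Return the set of organism names found under one node of the tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_clade(tree, group):
    """True if some branch point (or single organism) has exactly
    the organisms in `group` as its descendants."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_clade(child, group) for child in tree)


# Assumed branching order for the six animals in the figure.
tree = ("lancelet", ("lamprey", ("fish", ("lizard", ("rabbit", "human")))))

print(is_clade(tree, {"lizard", "rabbit", "human"}))   # True: the clade Amniota
print(is_clade(tree, {"fish", "lizard"}))              # False: omits rabbit and human
```

A group passes the check only when it matches everything below a single branch point, which is the same rule stated above in words.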


Note: 
Art Connection 


Lancelet Lamprey Fish Lizard Rabbit Human 


Lizards, rabbits, and humans all 
descend from a common ancestor that 
had an amniotic egg. Thus, lizards, 
rabbits, and humans all belong to the 
clade Amniota. Vertebrata is a larger 
clade that also includes fish and 
lamprey. 


Which animals in this figure belong to a clade that includes animals with 
hair? Which evolved first, hair or the amniotic egg? 


Clades can vary in size depending on which branch point is being 
referenced. The important factor is that all of the organisms in the clade or 
monophyletic group stem from a single point on the tree. This can be 
remembered because monophyletic breaks down into “mono,” meaning 
one, and “phyletic,” meaning evolutionary relationship. [link] shows
various examples of clades. Notice how each clade comes from a single
point, whereas the non-clade groups show branches that do not share a
single point.


Note: 
Art Connection 


Clades and Not Clades
(Figure panels: trees showing Entamoebae, slime molds, animals, fungi,
plants, ciliates, flagellates, trichomonads, microsporidia, and
diplomonads, with example groupings labeled as clades or as not clades.)


All the organisms within a clade stem from a 
single point on the tree. A clade may contain 
multiple groups, as in the case of animals, 
fungi and plants, or a single group, as in the 
case of flagellates. Groups that diverge at a 
different branch point, or that do not include 
all groups in a single branch point, are not 
considered clades. 


What is the largest clade in this diagram? 


Shared Characteristics 


Organisms evolve from common ancestors and then diversify. Scientists use 
the phrase “descent with modification” because even though related 
organisms have many of the same characteristics and genetic codes, 
changes occur. This pattern repeats over and over as one goes through the 
phylogenetic tree of life: 


1. A change in the genetic makeup of an organism leads to a new trait 
which becomes prevalent in the group. 

2. Many organisms descend from this point and have this trait. 

3. New variations continue to arise: some are adaptive and persist, 
leading to new traits. 

4. With new traits, a new branch point is determined (go back to step 1
and repeat). 


If a characteristic is found in the ancestor of a group, it is considered a 
shared ancestral character because all of the organisms in the taxon or 
clade have that trait. The vertebral column in [link] is a shared ancestral
character. Now consider the amniotic egg characteristic in the same figure.
Only some of the organisms in [link] have this trait, and for those that do,
it is called a shared derived character because this trait arose at some
point but is not shared by all of the organisms in the tree.


The tricky aspect to shared ancestral and shared derived characters is the 
fact that these terms are relative. The same trait can be considered one or 
the other depending on the particular diagram being used. Returning to 
[link], note that the amniotic egg is a shared ancestral character for the 
Amniota clade, while having hair is a shared derived character for some 
organisms in this group. These terms help scientists distinguish between 
clades in the building of phylogenetic trees. 
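Because the two terms are relative, the same trait can receive different labels depending on which group is being examined. The Python sketch below illustrates this under simple assumptions: the trait lists are hypothetical stand-ins for the figure, and classify_character is a name invented for this example.

```python
def classify_character(clade_members, organisms_with_trait):
    """Label a trait relative to one chosen group of organisms."""
    clade = set(clade_members)
    carriers = clade & set(organisms_with_trait)
    if carriers == clade:
        return "shared ancestral"   # every member of the group has it
    if carriers:
        return "shared derived"     # only a subset of the group has it
    return "absent"


# Hypothetical groups and trait lists based on the animals in the figure.
amniota = {"lizard", "rabbit", "human"}
vertebrata = {"lamprey", "fish", "lizard", "rabbit", "human"}
amniotic_egg = {"lizard", "rabbit", "human"}
hair = {"rabbit", "human"}

print(classify_character(amniota, amniotic_egg))     # shared ancestral
print(classify_character(amniota, hair))             # shared derived
print(classify_character(vertebrata, amniotic_egg))  # shared derived
```

The amniotic egg is labeled differently in the first and third calls even though the trait itself has not changed; only the reference group has.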


Choosing the Right Relationships 


Imagine being the person responsible for organizing all of the items in a
department store properly—an overwhelming task. Organizing the 
evolutionary relationships of all life on Earth proves much more difficult: 
scientists must span enormous blocks of time and work with information 
from long-extinct organisms. Trying to decipher the proper connections, 
especially given the presence of homologies and analogies, makes the task 
of building an accurate tree of life extraordinarily difficult. Add to that the 
advancement of DNA technology, which now provides large quantities of 
genetic sequences to be used and analyzed. Taxonomy is a subjective 
discipline: many organisms have more than one connection to each other, so 
each taxonomist will decide the order of connections. 


To aid in the tremendous task of describing phylogenies accurately, 
scientists often use a concept called maximum parsimony, which means 
that events occurred in the simplest, most obvious way. For example, if a 
group of people entered a forest preserve to go hiking, based on the 
principle of maximum parsimony, one could predict that most of the people 
would hike on established trails rather than forge new ones. 


For scientists deciphering evolutionary pathways, the same idea is used: the 
pathway of evolution probably includes the fewest major events that 
coincide with the evidence at hand. Starting with all of the homologous 
traits in a group of organisms, scientists look for the most obvious and 
simple order of evolutionary events that led to the occurrence of those traits. 
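
As a rough illustration of counting "the fewest major events," the Python sketch below scores two candidate trees and prefers the one that requires fewer trait changes. It uses Fitch's counting method, a standard technique that the text does not name. The organisms, the 0/1 trait values, and both candidate trees are assumptions chosen for the example.

```python
# Hypothetical character data: 1 = trait present, 0 = trait absent.
CHARACTERS = {
    "amniotic egg": {"fish": 0, "lizard": 1, "rabbit": 1, "human": 1},
    "hair": {"fish": 0, "lizard": 0, "rabbit": 1, "human": 1},
}


def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one character.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping each leaf name to its observed state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # The two branches can agree on a state: no extra change needed.
        return common, left_cost + right_cost
    # The branches disagree: at least one change occurred here.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, characters):
    """Total number of changes a tree requires across all characters."""
    return sum(fitch(tree, states)[1] for states in characters.values())


# Two hypothetical candidate trees written as nested pairs.
tree_a = ("fish", ("lizard", ("rabbit", "human")))
tree_b = ("rabbit", ("lizard", ("fish", "human")))

candidates = {"tree_a": tree_a, "tree_b": tree_b}
best = min(candidates, key=lambda name: parsimony_score(candidates[name], CHARACTERS))

print(parsimony_score(tree_a, CHARACTERS))  # 2
print(parsimony_score(tree_b, CHARACTERS))  # 3
print(best)                                 # tree_a
```

Under these assumed data, the tree that groups rabbit with human needs one origin of each trait, so it is the more parsimonious hypothesis. Real analyses compare far more characters and candidate trees.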


Note: 
Link to Learning 


Head to this website to learn how maximum parsimony is used to create 
phylogenetic trees. 

These tools and concepts are only a few of the strategies scientists use to 
tackle the task of revealing the evolutionary history of life on Earth. 
Newer technologies have revealed unexpected relationships, such as the 
finding that people seem to be more closely related to fungi than fungi are 
to plants. Sound unbelievable? As the information about DNA sequences 
grows, scientists will come closer to mapping the evolutionary history of 
all life on Earth. 


Section Summary 


To build phylogenetic trees, scientists must collect accurate information that 
allows them to make evolutionary connections between organisms. Using 
morphologic and molecular data, scientists work to identify homologous 
characteristics and genes. Similarities between organisms can stem either 
from shared evolutionary history (homologies) or from separate 
evolutionary paths (analogies). Newer technologies can be used to help 
distinguish homologies from analogies. After homologous information is 
identified, scientists use cladistics to organize these events as a means to 
determine an evolutionary timeline. Scientists apply the concept of 
maximum parsimony, which states that the order of events probably 
occurred in the most obvious and simple way with the fewest steps. For 
evolutionary events, this would be the path with the fewest major 
divergences that correlate with the evidence. 


Art Connections 


Exercise: 


Problem: 


[link] Which animals in this figure belong to a clade that includes 
animals with hair? Which evolved first, hair or the amniotic egg? 


Solution: 
[link] Rabbits and humans belong in the clade that includes animals 
with hair. The amniotic egg evolved before hair because the Amniota 
clade is larger than the clade that encompasses animals with hair. 


Exercise: 


Problem: [link] What is the largest clade in this diagram? 
Solution: 


[link] The largest clade encompasses the entire tree. 


Review Questions 


Exercise: 


Problem: Which statement about analogies is correct? 


a. They occur only as errors. 

b. They are synonymous with homologous traits. 

c. They are derived by similar environmental constraints. 
d. They are a form of mutation. 


Solution: 


C 


Exercise: 


Problem: What do scientists use to apply cladistics? 


a. homologous traits 

b. homoplasies 

c. analogous traits 

d. monophyletic groups 


Solution: 


A 
Exercise: 


Problem: 
What is true about organisms that are a part of the same clade? 


a. They all share the same basic characteristics. 
b. They evolved from a shared ancestor. 


c. They usually fall into the same classification taxa. 
d. They have identical phylogenies. 


Solution: 


B 
Exercise: 


Problem: 
Why do scientists apply the concept of maximum parsimony? 


a. to decipher accurate phylogenies 

b. to eliminate analogous traits 

c. to identify mutations in DNA codes 
d. to locate homoplasies 


Solution: 


A 


Free Response 


Exercise: 
Problem: 


Dolphins and fish have similar body shapes. Is this feature more likely 
a homologous or analogous trait? 


Solution: 


Dolphins are mammals and fish are not, which means that their 
evolutionary paths (phylogenies) are quite separate. Dolphins probably 
adapted to have a similar body plan after returning to an aquatic 
lifestyle, and, therefore, this trait is probably analogous. 


Exercise: 
Problem: 
Why is it so important for scientists to distinguish between 
homologous and analogous characteristics before building 
phylogenetic trees? 


Solution: 


Phylogenetic trees are based on evolutionary connections. If an 
analogous similarity were used on a tree, this would be erroneous and, 
furthermore, would cause the subsequent branches to be inaccurate. 


Exercise: 


Problem: Describe maximum parsimony. 
Solution: 


Maximum parsimony hypothesizes that events occurred in the 
simplest, most obvious way, and the pathway of evolution probably 
includes the fewest major events that coincide with the evidence at 
hand. 


Glossary 


analogy 
(also, homoplasy) characteristic that is similar between organisms by 
convergent evolution, not due to the same evolutionary path 


cladistics 
system used to organize homologous traits to describe phylogenies 


maximum parsimony 
applying the simplest, most obvious way with the fewest steps 


molecular systematics 
technique using molecular evidence to identify phylogenetic 
relationships 


monophyletic group 
(also, clade) organisms that share a single ancestor 


shared ancestral character 
describes a characteristic on a phylogenetic tree that is shared by all 
organisms on the tree 


shared derived character 
describes a characteristic on a phylogenetic tree that is shared only by 
a certain clade of organisms 